Preface


This volume is the result of the collective efforts of many researchers who participated in an
interdisciplinary workshop on "Advances in Analogy Research" held in July 1998 at the Central
and Eastern European Center for Cognitive Science at the New Bulgarian University, Sofia.

The purpose of the workshop was to stimulate researchers in the field of analogy to cooperate
more intensively and to integrate various approaches and data in their studies. Its aim was to
advance our understanding of the cognitive mechanisms of analogy-making: how people notice or
perceive analogies, how they retrieve analogs from memory or construct them, how they map and
transfer knowledge from one domain to another, how they combine knowledge from multiple analogs
or combine analogy with rule-based reasoning, how they generalize and learn from the analogies
they make, and how they use analogies for problem solving, explanation, argumentation, and
creation. What is the place of analogy among the various cognitive processes, such as perception,
thinking, memory, and learning? What is the role of analogy in human development? Which brain
structures are involved in analogy-making processes? What kinds of analogy-related deficits do
brain-damaged patients exhibit?

The workshop was highly interdisciplinary and made a serious attempt to integrate the knowledge
researchers have accumulated on analogy-making in various areas: Artificial Intelligence and
Computational Modeling, Cognitive Psychology, Developmental Psychology, Neuropsychology,
Philosophy, and Cognitive Linguistics, as well as applications in Design, Legal and Political
Reasoning, and Education. It likewise sought to bring together the positive results obtained so
far in theories of analogy-making, computational modeling, and experimental work.

This was a unique workshop that drew together most of the key researchers in the field of
analogy and gave them the chance to exchange ideas, share visions, and form friendships. It
attracted about 70 participants from all over the world (25 from the USA, 10 from France, 6 from
Germany, 5 from the UK, 3 from Australia, 3 from Ireland, 2 from Canada, 2 from Japan, 2 from
Poland, 2 from Belgium, 1 from the Netherlands, 1 from Sweden, 1 from New Zealand, and 7 from
Bulgaria). They presented 59 papers, including 14 key talks, 30 talks, and 15 posters.

We would especially like to thank all the key speakers and presenters for their valuable
contributions to the success of the workshop. We would also like to thank the local organisers:
Guergana Yancheva, Iliana Haralanova, Ivailo Milenkov, and Ivailo Panov.

We wish to thank the following for their contribution to the success of this workshop:

Cognitive Science Society - USA, Fulbright Commission - Sofia, MIT Press - USA, United
States Air Force European Office of Aerospace Research and Development.
ANALOGY IN A PHYSICAL SYMBOL SYSTEM

Keith J. Holyoak, John E. Hummel

Department of Psychology and Brain Research Institute
University of California, Los Angeles
Los Angeles, CA 90095-1563 USA

email: holyoak@lifesci.ucla.edu, jhummel@lifesci.ucla.edu


Abstract: Analogy, and relational reasoning in general, depends on a Physical Symbol System
(PSS). We argue that the biological PSS that underlies human (and other primate) intelligence
is based on mechanisms for dynamically and independently binding fillers to roles, which
require working-memory representations maintained by dorsolateral prefrontal cortex. Our
approach, termed symbolic connectionism, realizes symbolic processing in a neural network. The
approach is instantiated in the LISA model (Hummel & Holyoak, 1997), which performs analog
retrieval, mapping, inference, and schema induction. LISA makes a strong distinction between the
driver analog, which is activated sequentially in small groups of propositions, and the recipient
analog, which passively responds to the activity of the driver. The driver/recipient distinction
leads to predictions about asymmetries and grouping effects in mapping, which we have tested and
confirmed. More generally, the model is consistent with recent evidence that working-memory
resources are required for more complex relational mappings, and that relational processing
depends on the dorsolateral prefrontal cortex.

PHYSICAL  SYMBOL  SYSTEMS 

A foundational principle of modern cognitive science is the Physical Symbol System hypothesis,
which states simply that human cognition is the product of a physical symbol system (PSS). A
symbol is a pattern that denotes something else; a symbol system is a set of symbols that can be
composed into more complex structures by a set of relations. The term “physical” conveys that a
symbol system can and must be realized in some physical way in order to create intelligence. The
physical basis may be the circuits of an electronic computer, the neural substrate of a thinking
biological organism, or in principle anything else that could implement a Turing machine-like
computing device (Newell, 1980, 1990; Vera & Simon, 1993, 1994).

Because analogical thinking, like other forms of relational reasoning, depends on composed
symbols (propositions specifying relations between the elements that fill specific roles, where
the elements may themselves be propositions), it necessarily requires a PSS. But what sort of
cognitive architecture could implement a PSS? The fact that the mind performs symbol manipulation
is important in constraining Marr’s (1980) computational level, but it remains to be determined
how the mind performs symbolic computation, which is a question at the level of representation
and algorithm, and also how the PSS is realized in the brain, which is a question at the level of
implementation. That is, the PSS that we seek to understand is the one that is the product of
biological evolution.

Both analogy and the PSS that underlies it appear to be late evolutionary developments.
Relational processing appears to be a key innovation in primate intelligence (see Tomasello &
Call, 1997); simple relational analogies can be solved by symbol-trained chimpanzees (Gillan,
Premack & Woodruff, 1981; Premack, 1983), and more complex analogical reasoning is a uniquely
human capability (Holyoak & Thagard, 1995). A great deal of evidence indicates that the
prefrontal cortex is a key component of the neural substrate for the PSS (for reviews see
Grafman, Holyoak & Boller, 1995; Shallice & Burgess, 1991). Reasoning abilities and prefrontal
cortex have developed in tandem across both phylogeny and ontogeny (Benson, 1993).
Neuropsychological studies of frontal lobe function indicate that prefrontal cortical
dysfunction, especially in the dorsolateral prefrontal cortex (DLPFC), leads to selective
decrements in performance on a variety of complex cognitive tasks that depend on relational
processing. The DLPFC is critical to working memory, to which relational reasoning appears to be
intimately connected. In particular, an essential role of working memory in reasoning may be to
maintain bindings between roles and fillers in relational representations (Robin & Holyoak,
1995). Thus, the DLPFC may be a major component of the neural system that implements the PSS,
and hence analogical reasoning.

SYMBOLIC  CONNECTIONISM 

More basic than the issue of where in the brain the PSS is realized is the issue of what types
of computations it employs to perform symbol manipulation. To address this issue, we have been
developing a neural-network model of analogy called LISA (Learning and Inference with Schemas
and Analogies). LISA represents an approach to building a PSS that we term symbolic
connectionism (Hummel & Holyoak, 1997, in press; Holyoak & Hummel, in press). We have argued
that one basic requirement for a PSS is the ability to represent roles (relations) independently
of their fillers (arguments), which makes it possible to appreciate what different symbolic
expressions have in common, and therefore to generalize flexibly from one to the other. In
addition, to compose symbols into systematic structures, and to appreciate how those structures
differ, it is necessary to explicitly bind relational roles to their fillers. “Jim loves Mary”
differs from “Mary loves Jim” not in the representation of Jim, Mary, and loves, but in the
binding of Jim and Mary to the roles of the love relation. What gives a symbolic representation
its power is precisely this capacity to represent roles independently of their fillers and at
the same time to express the binding of roles to fillers dynamically, that is, without changing
the representation of the roles or fillers (Fodor & Pylyshyn, 1988; Holyoak & Hummel, in press).

The symbolic connectionist framework that we have been developing seeks to realize these
properties in neural networks. Traditional symbolic representations in cognitive science
(generally in predicate-calculus-style notations) make no claim to be neurally plausible, as
they permit arbitrary operations to create and move symbols freely from one structure to
another. Early computational models of analogy, such as SME (Falkenhainer, Forbus & Gentner,
1989) and ACME (Holyoak & Thagard, 1989), were based on traditional symbolic representations,
which render them inadequate as psychological and neural models (Hummel & Holyoak, 1997). Neural
systems, which disallow such arbitrary operations, need some alternative means for composing
invariant representations into symbolic structures, that is, for dynamically binding roles to
their fillers.

Symbolic connectionist models (Holyoak & Hummel, in press; Hummel & Holyoak, 1997) and their
precursors (Hummel & Biederman, 1992; von der Malsburg, 1981) use synchrony of firing for this
purpose. The basic idea is that if two elements are bound together, then the neurons (or units
in an artificial neural network) representing those elements fire in synchrony with one another;
critically, elements that are not bound together fire out of synchrony. For example, to
represent “Jim loves Mary”, the units for Jim would fire in synchrony with the units for lover,
while the units for Mary fire in synchrony with the units for beloved. To represent “Mary loves
Jim”, the very same units would be placed into the opposite synchrony relations, so that Mary
fires in synchrony with lover while Jim fires in synchrony with beloved.
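
The binding-by-synchrony idea can be illustrated with a small Python sketch. It is a toy under stated assumptions, not LISA itself: firing is reduced to discrete time slots, and the function names `bind_by_synchrony` and `bound_together` are hypothetical.

```python
def bind_by_synchrony(bindings):
    """Assign each role-filler binding its own time slot (phase).

    `bindings` is a list of (filler, role) pairs. Units that share a slot
    fire in synchrony; units in different slots fire out of synchrony.
    Returns a dict mapping each unit name to its slot index.
    (Discrete slots are a simplifying assumption of this sketch.)
    """
    phase = {}
    for slot, (filler, role) in enumerate(bindings):
        phase[filler] = slot
        phase[role] = slot
    return phase


def bound_together(phase, unit_a, unit_b):
    """Two units count as bound exactly when they fire in the same slot."""
    return phase[unit_a] == phase[unit_b]


# The same four units encode both propositions; only the timing differs.
jim_loves_mary = bind_by_synchrony([("Jim", "lover"), ("Mary", "beloved")])
mary_loves_jim = bind_by_synchrony([("Mary", "lover"), ("Jim", "beloved")])
```

Both dictionaries contain exactly the same unit names, which mirrors the point that the representation of Jim, Mary, lover, and beloved is unchanged; only the synchrony relations carry the binding.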

Symbolic connectionism represents a striking difference (and, we would argue, a striking
advance) over traditional symbolic architectures of cognition (e.g., Anderson, 1993; Rosenbloom
et al., 1991). One advantage of symbolic connectionism derives from an apparent weakness: it is
hard to do symbol manipulation in a connectionist architecture. This is because symbol
manipulation requires dynamic binding, and dynamic binding is difficult to perform in a
connectionist architecture (Hummel & Stankiewicz, 1996, in press). In the case of dynamic
binding by synchrony of firing, some mechanism has to get the right units into synchrony with
one another and (what is even more difficult) keep them out of synchrony with all the other
units. It takes work to establish synchrony and (especially) asynchrony, and some process must
perform this work. A likely neural system for performing such operations is the human DLPFC.

In a traditional symbol architecture, by contrast, bindings are unlimited and require no
special capabilities. Of course, a theorist may opt to impose some limit on binding, in
deference to the glaring fact that people have limited capacity to make and break role bindings;
but this will simply be an ad hoc “add-on” rather than a deep implication of the proposed
symbolic architecture. In contrast, a model that represents bindings with synchrony (e.g., LISA
and related models such as JIM; Hummel & Biederman, 1992; Hummel & Stankiewicz, 1996) is
inherently limited in the number of things it may simultaneously have active and mutually out of
synchrony with one another (although there is no theoretical limit on the number of entities in
any one synchronized group). That is, there is a limit on the number of distinct bindings such a
model may have in working memory at any one time (Hummel & Holyoak, 1997; Shastri & Ajjanagadde,
1993). Humans, too, have limited working memory and attention. Symbolic connectionism, as an
algorithmic theory of symbol systems, thus provides a natural account of the fact that humans
have a limited working memory capacity.

We  will  now  review  the  LISA  model,  and 
then  consider  recent  psychological  and  neural 
evidence  that  human  analogical  reasoning  is 
closely  tied  to  working  memory. 


THE  LISA  MODEL 

Analog Representation, Retrieval and Mapping

We will first sketch the LISA model and its approach to analog retrieval and mapping. These
operations are described in detail (along with simulation results) by Hummel and Holyoak (1997).
The core of LISA’s architecture is a system for actively (i.e., dynamically) binding roles to
their fillers in working memory (WM) and encoding those bindings in long-term memory (LTM).
LISA uses synchrony of firing for dynamic binding in WM (Shastri & Ajjanagadde, 1993). Case
roles and objects are represented in WM as distributed patterns of activation on a collection of
semantic units (small circles in Figure 1); case roles and objects fire in synchrony when they
are bound together and out of synchrony when they are not.

Every proposition is encoded in LTM by a hierarchy of structure units (see Figures 1 and 2). At
the bottom of the hierarchy are predicate and object units. Each predicate unit locally codes
one case role of one predicate. For example, love1 represents the first (agent) role of the
predicate “love”, and has bidirectional excitatory connections to all the semantic units
representing that role (e.g., emotion1, strong1, positive1, etc.); love2 represents the patient
role and is connected to the corresponding semantic units (e.g., emotion2, strong2, positive2,
etc.). Semantically related predicates share units in corresponding roles (e.g., love1 and like1
share many units), making the semantic similarity of different predicates explicit. Object units
are just like predicate units except that they are connected to semantic units describing things
rather than roles. For example, the object unit Mary might be connected to units for human,
adult, female, etc., whereas rose might be connected to plant, flower, and fragrant.

Figure 1. Illustration of the LISA representation of the proposition "love (Jim, Mary)".

Sub-proposition units (SPs) bind roles to objects in LTM. For example, “love (Jim, Mary)” would
be represented by two SPs, one binding Jim to the agent of loving, and the other binding Mary to
the patient role (Figure 1). The Jim+agent SP has bidirectional excitatory connections with Jim
and love1, and the Mary+patient SP has connections with Mary and love2. Proposition (P) units
reside at the top of the hierarchy and have bidirectional excitatory connections with the
corresponding SP units. P units serve a dual role in hierarchical structures (such as “Sam knows
that Jim loves Mary”), and behave differently according to whether they are currently serving as
the “parent” of their own proposition or the “child” (i.e., argument) of another (Hummel &
Holyoak, 1997). It is important to emphasize that structure units do not encode semantic content
in any direct way. Rather, they serve only to store that content in LTM, and to generate (and
respond to) the corresponding synchrony patterns on the semantic units.

Figure 2. Representation of the "love and flowers" analogy. Shapes (triangle, rectangle, etc.)
correspond to classes of units as in Figure 1. Not all connections are shown.
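
The hierarchy of structure units described above can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class names (`Unit`, `SubProposition`, `Proposition`) are hypothetical, connections are reduced to plain references, and the semantic features given to Jim are assumed by analogy with those the text lists for Mary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A predicate (role) unit or an object unit and its semantic units."""
    name: str
    semantics: frozenset


@dataclass(frozen=True)
class SubProposition:
    """An SP unit: binds one role to one filler."""
    role: Unit
    filler: Unit

    def pattern(self):
        # Semantic units that fire together while this binding is active.
        return self.role.semantics | self.filler.semantics


@dataclass(frozen=True)
class Proposition:
    """A P unit: sits above the SP units of one proposition."""
    sps: tuple

    def firing_sequence(self):
        # One semantic pattern per role binding, produced one after another,
        # i.e. the bindings are out of synchrony with each other.
        return [sp.pattern() for sp in self.sps]


love1 = Unit("love1", frozenset({"emotion1", "strong1", "positive1"}))
love2 = Unit("love2", frozenset({"emotion2", "strong2", "positive2"}))
jim = Unit("Jim", frozenset({"human", "adult", "male"}))      # assumed features
mary = Unit("Mary", frozenset({"human", "adult", "female"}))

loves_jim_mary = Proposition((SubProposition(love1, jim),
                              SubProposition(love2, mary)))
```

Note that the structure units hold no semantic content of their own; all content lives in the `semantics` sets, and the SP and P units only determine which sets become active together.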

The final component of LISA’s architecture is a set of mapping connections between structure
units of the same type in different analogs. Every P unit in one analog shares a mapping
connection with every P unit in every other analog; likewise, SPs share connections across
analogs, as do objects and predicates. For the purposes of mapping and retrieval, analogs are
divided into two mutually exclusive sets: a driver and one or more recipients. Retrieval and
mapping are controlled by the driver. (There is no necessary linkage between the
driver/recipient distinction and the more familiar source/target distinction.)

LISA performs mapping as a form of guided pattern matching. As P units in the driver become
active, they generate (via their SP, predicate and object units) patterns on the semantic units
(one pattern for each role-argument binding). The semantic units are shared by all propositions,
so the patterns generated by one proposition will activate one or more similar propositions in
LTM (analogical access) or in WM (analogical mapping). Mapping differs from retrieval solely by
the addition of the modifiable mapping connections. During mapping, the weights on the mapping
connections grow larger when the units they link are active simultaneously, permitting LISA to
learn the correspondences generated during retrieval. These connection weights also serve to
constrain subsequent memory access. By the end of a simulation run, corresponding structure
units will have large positive weights on their mapping connections, and non-corresponding units
will have strongly negative weights.
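
The behavior of the mapping connections can be sketched with a simple Hebbian-style update in Python. The exact learning rule is not given in this section, so the rule below is an assumption made for illustration: weights grow with co-activity, and a subtractive term (also assumed) pushes weights down when a driver unit is active but the recipient unit is not. The function name `update_mapping_weights` is hypothetical.

```python
def update_mapping_weights(weights, driver_act, recipient_act, rate=0.1):
    """One Hebbian-style step on driver-to-recipient mapping weights.

    weights[(d, r)] grows when d and r are active together and shrinks
    when d is active but r is not. The subtractive term is an assumption
    of this sketch, standing in for competition among mappings.
    """
    updated = dict(weights)
    for (d, r), w in weights.items():
        d_act = driver_act.get(d, 0.0)
        r_act = recipient_act.get(r, 0.0)
        co_activity = d_act * r_act
        mismatch = d_act * (1.0 - r_act)
        updated[(d, r)] = w + rate * (co_activity - mismatch)
    return updated


# Every driver object is connected to every recipient object.
weights = {(d, r): 0.0 for d in ("Jim", "Mary") for r in ("Bill", "Susan")}

# The driver fires its two bindings in turn; the recipient unit that
# responds to each is active at the same time.
for _ in range(5):
    weights = update_mapping_weights(weights, {"Jim": 1.0}, {"Bill": 1.0})
    weights = update_mapping_weights(weights, {"Mary": 1.0}, {"Susan": 1.0})
```

After a few cycles the Jim-Bill and Mary-Susan weights are positive and the Jim-Susan and Mary-Bill weights are negative, which is the qualitative end state described in the text.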

Inference  and  Schema  Induction 

Augmented with intersection discovery and unsupervised learning, LISA’s approach to mapping
supports inference and schema induction as a natural extension (Hummel & Holyoak, 1996).
Consider the previous “love and flowers” analogs (Figure 2). During mapping, corresponding
elements in the two analogs will become active simultaneously. For instance, the role bindings
of “love (Jim, Mary)” will fire in synchrony with the corresponding role bindings of “love (Bill,
Susan)”, while the agent and patient bindings will fire out of synchrony with each other (Figure
3a). Jim shares male with Bill, and Mary shares female with Susan, so a natural proposition to
induce from these correspondences is “loves (male, female)” (Figure 3b). To induce this part of
the schema, it is necessary to (a) make explicit what corresponding elements have in common, and
(b) encode those common elements into LTM as a new proposition.

LISA performs (a) by means of a simple type of intersection discovery. Although we have
described the activation of semantic units only from the perspective of the driver, the
recipient analog also feeds activation to the semantic units. The activation of a semantic unit
is a linear function of its inputs, so any semantic unit that is common to both the driver and
recipient will receive input from both and become roughly twice as active as any semantic unit
receiving input from only one analog. Common semantic elements are thus tagged as such by their
activation values.
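
Intersection discovery as described here amounts to summing inputs and reading off the most active units. The following Python sketch assumes unit input strength from each analog and a threshold of 1.5; the feature lists for Jim and Bill (including "tall" and "short") and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def semantic_activation(driver_semantics, recipient_semantics):
    """Linear activation: each analog adds one unit of input per feature."""
    activation = Counter()
    for semantics in (driver_semantics, recipient_semantics):
        for unit in semantics:
            activation[unit] += 1.0
    return activation


def shared_semantics(activation, threshold=1.5):
    """Semantic units driven by both analogs exceed the threshold."""
    return {unit for unit, act in activation.items() if act > threshold}


jim = {"human", "male", "tall"}     # hypothetical features
bill = {"human", "male", "short"}   # hypothetical features

activation = semantic_activation(jim, bill)
common = shared_semantics(activation)
```

Here `common` contains the features shared by Jim and Bill, because only those units receive input from both the driver and the recipient.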

These common elements are encoded into LTM by means of an unsupervised learning algorithm. In
addition to structure units representing the known source and target analogs, LISA has a
collection of unrecruited structure units (i.e., units with random connections to one another
and to the semantic units) that reside together in a third “schema analog” (Figure 3).
Unrecruited predicate and object units have input thresholds that only allow them to receive
input from highly active semantic units, that is, semantic units that are common to both the
driver and recipient analogs. Such semantic units are depicted in dark gray in Figure 3. Without
the aid of an external teacher, these unrecruited schema units learn to respond to these common
elements of the known analogs. Simultaneously, unrecruited SP units learn to respond to specific
conjunctions of predicate, object, and (in the case of hierarchical propositions) P units, and
unrecruited P units learn to respond to specific combinations of SP units. The result is that
propositions describing the common elements of the known analogs are encoded into LTM as a third
analog: a schema. Figure 3 illustrates this process for one proposition in the “love and
flowers” analogy.
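
The outcome of schema induction for one proposition can be sketched in Python as follows. This is a deliberately simplified stand-in, not the unsupervised learning algorithm itself: set intersection replaces the thresholded input to unrecruited units, the correspondence between bindings is assumed to be already known from mapping, and the function name `induce_schema` and the extra features ("tall", "short", "young", "old") are hypothetical.

```python
def induce_schema(driver_prop, recipient_prop):
    """Build one schema proposition from two mapped propositions.

    Each proposition is a list of (role_semantics, filler_semantics) pairs,
    listed in corresponding order. Only semantic units present in both
    analogs survive, standing in for units active enough to pass the
    input threshold of an unrecruited schema unit.
    """
    schema = []
    for (d_role, d_filler), (r_role, r_filler) in zip(driver_prop, recipient_prop):
        schema.append((frozenset(d_role & r_role),
                       frozenset(d_filler & r_filler)))
    return schema


agent = {"emotion1", "strong1", "positive1"}
patient = {"emotion2", "strong2", "positive2"}

love_jim_mary = [(agent, {"human", "male", "tall"}),
                 (patient, {"human", "female", "young"})]
love_bill_susan = [(agent, {"human", "male", "short"}),
                   (patient, {"human", "female", "old"})]

schema = induce_schema(love_jim_mary, love_bill_susan)
```

The resulting schema keeps both roles of love intact and reduces the fillers to their shared features, corresponding to a proposition along the lines of love(human male, human female).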

LISA accomplishes analogical inference by the same unsupervised learning algorithm as used for
schema induction, except that the unrecruited units reside not in a completely separate analog
(the to-be-induced schema), but in the target itself.

Figure 3. (a) Jim+love-agent in Analog 1 activates Bill+love-agent in Analog 2. In the Schema,
predicate unit 1 is recruited for love-agent, and object unit 3 is recruited for the
intersection of Jim and Bill ("human" and "male"). SP 4 is recruited for human male (object 3)
bound to love-agent (predicate 1). Proposition unit 3 begins to be recruited. (b)
Mary+love-patient in Analog 1 activates Susan+love-patient in Analog 2. Predicate 4 is recruited
for love-patient; object 1 is recruited for "human" and "female". SP 7 is recruited for the
binding of predicate 4 and object 1. Proposition unit 3 now codes "love(human male, human
female)".

WORKING MEMORY AND RELATIONAL REASONING

Grouping  Effects  and  Mapping  Asymmetries 

A key distinction between LISA and previous computational models is its emphasis on the role of
working memory in controlling mapping. In LISA, mapping is a directional, capacity-limited and
sequential process. The directional aspect of mapping follows from the driver/recipient
distinction. If a driver analog contains more propositions than WM can hold, the propositions
must be fired in small groups (roughly, up to six role bindings, or 2-3 propositions, at a
time).
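
The capacity limit can be made concrete with a short Python sketch that splits a driver's propositions into groups of at most six role bindings. The greedy, in-order grouping policy and the function name `group_for_working_memory` are assumptions of this sketch; how propositions are actually grouped (for example, by causal connection) is an empirical question and is not specified by this code.

```python
def group_for_working_memory(propositions, max_bindings=6):
    """Split driver propositions into groups that fit in working memory.

    `propositions` is a list of (name, n_role_bindings) pairs. They are
    taken in order, and a new group starts whenever adding the next
    proposition would exceed `max_bindings`.
    """
    groups, current, used = [], [], 0
    for name, n_bindings in propositions:
        if n_bindings > max_bindings:
            raise ValueError(f"{name} alone exceeds working-memory capacity")
        if used + n_bindings > max_bindings:
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += n_bindings
    if current:
        groups.append(current)
    return groups


# Hypothetical driver: three two-place propositions, one three-place, one two-place.
driver = [("P1", 2), ("P2", 2), ("P3", 2), ("P4", 3), ("P5", 2)]
groups = group_for_working_memory(driver)
```

With these inputs the driver is fired in two groups, of three and two propositions, consistent with the rough figure of 2-3 propositions at a time.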

The role of WM in LISA’s operation leads to predictions about the influence of grouping
propositions on the performance of the model (and hence of people). For example, if the text of
the driver analog is thematically connected (e.g., by causal relations), then mapping may be
more accurate than if the text consists of causally unrelated propositions. This prediction was
confirmed in a study by Keane (1997).

Our group (Kubose, Holyoak & Hummel, 1997) extended Keane’s procedure to demonstrate that the
impact of causal structure on mapping is inherently asymmetrical. Although models such as SME
(Bowdle & Gentner, 1997) and ACME (Holyoak, Novick & Melz, 1994) can account for asymmetries
that arise in post-mapping analogical inferences, only LISA and the IAM model (Keane, Ledgeway &
Duff, 1994) predict asymmetries in the mapping stage itself (and only LISA predicts asymmetries
as measured by mapping accuracy). In LISA, the driver but not the recipient is processed
sequentially, and hence it is the driver that is sensitive to groupings of propositions. It
follows that mapping performance with isomorphic analogs will be more accurate if the driver
analog is causally connected and the recipient analog is not, rather than vice versa.

Kubose et al. manipulated the driver/recipient status by having subjects first answer questions
about one or the other analog, and then asking directed mapping questions (i.e., for each object
and relation in the driver, subjects were asked to provide the corresponding element from the
recipient). The results supported LISA’s prediction, indicating that mapping performance was
more accurate when the driver analog, rather than the recipient, had causal content.

Other experiments supported LISA’s interpretation of causal effects on mapping as being
mediated by selective grouping of propositions. If neither analog was thematic, but certain
propositions in the driver were optimally grouped, simply by drawing a box around them and
asking subjects to consider them together, mapping accuracy was improved.

Other recent experiments by our group (Grewall, Law & Holyoak, in progress) have tested a
different type of prediction that LISA makes about the role of working memory in mapping. In
accord with the theory of relational complexity developed by Halford and his colleagues (Halford
& Wilson, 1980; Halford, Wilson & Phillips, in press), LISA predicts that the complexity of
mappings is constrained by the availability of WM resources to maintain multiple dynamic role
bindings concurrently. It follows that if WM is restricted by adding dual-task requirements
(e.g., digit memory load), which are known to compete for WM capacity (e.g., Hitch & Baddeley,
1976; Gilhooly et al., 1993), the ability to make relationally complex mappings will be
impaired.

In order to determine whether a dual task will shift the preferred basis for making
comparisons, we asked subjects to map a set of stimuli with ambiguous mappings. These stimuli,
created by Markman and Gentner (1993), are pairs of pictures (e.g., a man bringing groceries to
a woman; a woman feeding nuts to a squirrel) in which one element of the first picture (e.g.,
the woman) can map to either of two elements in the second (the woman, on the basis of
perceptual similarity, or the squirrel, based on the shared role of recipient-of-food). We found
that adding a dual task (concurrent digit load) caused a shift from relational to more direct
perceptual similarity as the basis of mappings. Such a shift is predicted by LISA because
finding the relational match is more dependent on WM resources, which are reduced by a
concurrent memory load.

In addition, Tohill and Holyoak (in progress) have shown similar reductions in relational
matches when subjects’ anxiety level is increased prior to the mapping task (by a difficult
backwards-counting task). The detrimental impact of anxiety on relational mapping is consistent
with theories of anxiety that emphasize its restrictive impact on WM resources (Eysenck & Calvo,
1992).

Neuropsychological and Neuroimaging Studies

We have also begun to investigate the neural locus of the operations that support relational
reasoning. Investigations by our group have revealed selective deficits in relational processing
in tasks similar to analogy, such as simple variants of Raven’s Progressive Matrices problems
(see Carpenter, Just & Shell, 1990), for patients with focal degeneration of the prefrontal
cortex (Waltz et al., in press). The patients tested were diagnosed with frontotemporal dementia
(FTD), a dementing syndrome resulting in the degeneration of anterior regions of cortex (Brun et
al., 1994). In the early stages of FTD, the degenerative process tends to be localized to either
prefrontal or anterior temporal cortical areas, with eventual involvement throughout all
cortical regions in advanced stages. This makes it possible to divide patients with mild FTD
into two subgroups. In the frontal variant of FTD, damage is initially localized in prefrontal
cortex. Patients with the temporal variant of FTD often exhibit semantic dementia, characterized
by impairments in semantic knowledge (Graham & Hodges, 1997).

Waltz et al. (in press) found that, relative to patients with damage to anterior temporal
cortex, patients with degeneration of prefrontal cortex show dramatic impairment in the ability
to make inferences requiring the integration of multiple relational representations. For
example, performance on a set of matrix problems showed striking differences between patients
with damage to prefrontal cortex, on the one hand, and both patients with damage to anterior
temporal cortex and normal controls, on the other, in the ability to integrate multiple
relational premises. The two patient groups did not differ either from each other or from normal
controls in the average proportion of correct responses given to problems not requiring
relational integration (i.e., problems with variation on at most one dimension). However, on
problems that required integration (those with variations on two dimensions), the patients with
prefrontal cortical damage were catastrophically impaired compared to patients with anterior
temporal lobe damage as well as normal controls.

To  complement  the  neuropsychological 
studies,  a  number  of  researchers  in  our  group  at 
UCLA  (Kroger,  Holyoak,  Bookheimer  &  Cohen; 
see  Kroger,  1998)  have  begun  to  perform  neu¬ 
roimaging  studies  to  investigate  the  neural  basis 
of  relational  processing  in  normal  college  stu¬ 
dents.  Previous  functional  imaging  studies  of 
reasoning  have  shown  involvement  of  the  same 
areas  of  cortex  as  are  activated  in  working-memory
tasks,  especially  DLPFC  (e.g.,  Prabhakaran
et  al.,  1997),  but  have  not  systematically  manip¬ 
ulated  relational  complexity. 

We  have  constructed  materials  matched 
closely  in  terms  of  visuospatial  attributes,  but 
varying  in  relational  complexity  (Halford  & 
Wilson,  1980;  Halford  et  al.,  in  press).  A  pilot 
experiment  in  progress  uses  variants  of  Raven’s 
Progressive  Matrices  problems  which  vary  the 
number  of  relations  that  must  be  considered
in  the  production  of  an  inductive  inference.
These  problems  are  more  complex  versions  of 
the  matrix  problems  used  with  FTD  patients  by 
Waltz  et  al.  (in  press),  suitable  for  use  with  normal
college  students.  In  pilot  work  in  progress,
we  are  using  five  levels  of  relational  complex¬ 
ity.  Behavioral  data  show  increasing  reaction 
times  as  relational  complexity  increases,  confirming
that  we  are  tapping  into  increasingly  complex
cognitive  processes.  Initial  analyses  of  data
from  the  first  subject  to  be  tested  reveal  that 
activation  in  prefrontal  cortex  (but  not  parietal
cortex)  increases  monotonically  with  relational
complexity  (Kroger,  1998).

These  neuropsychological  and  initial  neu¬ 
roimaging  results  provide  support  for  our  hy¬ 
pothesis  that  relational  processing  may  form  the 
core  of  an  executive  component  of  prefrontal 
working  memory,  which  implies  both  the  active
maintenance  of  information  and  its  processing.
In  other  words,  relational  integration  (and
specifically,  dynamic  variable  binding)
may  be  the  “work”  done  by  working  memory.
We  have  recently  begun  to  simulate  our  neuropsychological
findings  using  the  LISA  model
(Holyoak  et  al.,  1998;  Hummel  et  al.,  1998).

CONCLUSION 

Symbolic  connectionism,  as  instantiated 
in  models  such  as  LISA,  offers  a  possible  ac¬ 
count  of  the  general  form  of  the  Physical 
Symbol  System  that  underlies  human  (and 
other  primate)  relational  reasoning.  LISA
provides  a  solution  to  the  problem  (forcefully
posed  by  Fodor  &  Pylyshyn,  1988)  of  representing
knowledge  over  a  distributed  set  of
units  while  preserving  systematic  relational 
structure.  Like  previous  models  based  on  tra¬ 
ditional  symbolic  representations,  LISA  is 
able  to  retrieve  and  map  analogs  based  in 
large  part  on  structural  constraints.  But  in  ad¬ 
dition,  LISA  is  able  to  capitalize  on  its  dis¬ 
tributed  representations  of  meaning  to  inte¬ 
grate  analogical  mapping  with  a  flexible 
mechanism  for  analogical  inference  and  sche¬ 
ma  induction. 

A  key  aspect  of  LISA,  given  its  use  of  dy¬ 
namic  binding,  is  that  analogical  processing  (and 
relational  reasoning  in  general)  is  heavily  con¬ 
strained  by  working-memory  resources.  In  or¬ 
der  to  make  relationally  complex  mappings,  the 
reasoner  must  be  able  to  consider  multiple  role 
bindings  together.  We  can  now  begin  to  see  not 
only  what  mappings  are  “natural”  for  human 
reasoners,  but  also  how  they  may  be  computed 
in  neural  systems,  and  what  regions  of  the  brain 
are  necessary  for  performing  these  computations. 

Given  the  key  importance  of  the  concept
of  “similarity”  for  understanding  analogy,  the 
purpose  of  my  paper  will  be  to  investigate  a 
parallel  issue  -  the  role  of  similarity  in  under¬ 
standing  categorization. 

It  may  seem  almost  tautological  to  say  that 
we  categorize  the  world  into  categories  of  sim¬ 
ilar  objects,  persons  or  events.  Similarity  is  af¬ 
ter  all  merely  an  extension  of  the  notion  of 
“sameness”.  Similarity  may  just  be  sameness 
in  respect  of  a  particular  set  of  features  or  di¬ 
mensions.  So  I  may  be  similar  to  a  colleague  in 
working  for  the  same  organization,  having  the 
same  job  title,  or  having  the  same  number  of 
children.  Similarity  may  also  be  closeness  on  a 
continuous  dimension,  so  that  I  and  my  col¬ 
league  may  share  a  similar  colour  of  hair,  a  sim¬ 
ilar  salary  or  a  similar  personality. 

As  these  examples  quickly  illustrate,  while 
we  expect  categories  to  be  composed  of  simi¬ 
lar  elements,  there  is  a  major  difficulty  in  ex¬ 
plaining  categorization  in  terms  of  raw  simi¬ 
larity  defined  as  sameness  or  closeness  on  a  set 
of  dimensions.  The  problem  is  that  there  is  an 
indefinitely  large  number  of  such  dimensions, 
and  there  could  therefore  be  any  number  of  rea¬ 
sons  for  placing  two  items  in  the  same  catego¬ 
ry  and  any  number  of  reasons  for  placing  them 
in  different  categories. 

The  idea  that  we  classify  together  those 
things  that  we  find  similar  has  had  a  chequered 
history  in  psychology.  While  there  was  consid¬ 
erable  theoretical  and  empirical  interest  in  the 
development  of  similarity-based  classification 
models  in  the  1970s,  particularly  with  Rosch 
and  Mervis’s  prototype  theory  and  Medin  &
Schaffer’s  exemplar  model  (Medin  &  Schaffer,
1978;  Rosch,  1975),  subsequently  the  field
has  split  into  two  very  distinct  camps.  On  the 
one  hand  increasingly  sophisticated  computa¬ 
tional  models  have  been  developed  to  explain 
how  people  learn  classifications  on  the  basis 
of  similarity.  Most  notable  in  this  area  are  de¬ 
velopments  of  exemplar  storage  models  based 
on  Medin  and  Schaffer’s  context  model.  The
models  assume  that  we  encode  stimuli  in  a 
multi-dimensional  similarity  space,  and  learn 
classifications  through  one  of  a  number  of  pos¬ 
sible  algorithms.  In  Nosofsky’s  Generalized 
Context  Model  (Nosofsky,  1988)  similarity  of 
a  new  stimulus  is  computed  to  all  the  stored 
exemplars  of  each  category  that  has  been 
learned,  and  a  choice  rule  determines  the  likelihood
of  classification  in  a  particular  category.
In  Ashby  and  Gott’s  (1988)  Decision  Bound
approach,  the  space  is  divided  up  by  hyper¬ 
planes  that  delimit  the  boundaries  where  the 
probability  of  belonging  in  one  category  equals 
that  of  belonging  in  its  neighbour.  These  dif¬ 
ferent  models  have  been  shown  to  provide  an 
excellent  fit  to  a  range  of  experimental  data  in 
classification  and  recognition  tasks. 
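
As a rough illustration of the exemplar approach, the following Python sketch sums the similarity of a new stimulus to the stored exemplars of each category and applies a ratio choice rule. The exponential similarity function, the city-block distance, the sensitivity parameter c, the attention weights and the toy exemplars are all assumptions chosen for illustration; this is a simplified sketch in the spirit of such models, not a reproduction of any published implementation.

```python
import math

def similarity(x, y, weights, c=1.0):
    """Similarity decays exponentially with weighted city-block distance."""
    distance = sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return math.exp(-c * distance)

def classification_probabilities(probe, exemplars, weights, c=1.0):
    """Ratio choice rule over summed similarity to each category's exemplars."""
    summed = {
        label: sum(similarity(probe, ex, weights, c) for ex in items)
        for label, items in exemplars.items()
    }
    total = sum(summed.values())
    return {label: s / total for label, s in summed.items()}

# Hypothetical two-dimensional stimuli (e.g. size and brightness).
exemplars = {
    "A": [(1.0, 1.0), (1.5, 1.2), (0.8, 1.4)],
    "B": [(3.0, 3.2), (3.4, 2.9), (2.9, 3.5)],
}
weights = (0.5, 0.5)  # attention weights, assumed to sum to 1

probs = classification_probabilities((1.2, 1.3), exemplars, weights)
```

Raising one attention weight at the expense of the other stretches the space along that dimension, which is one way such models adapt dimensional weights to the distribution of stimuli.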

Meanwhile,  researchers  in  higher  level  cog¬ 
nition  have  questioned  the  degree  to  which  the 
notion  of  similarity  is  sufficiently  clearly  de¬ 
fined  and  well  enough  constrained  to  serve  as 
an  explanation  of  how  we  actually  carve  up  and 
categorize  the  real  world  around  us,  as  opposed 
to  the  artificial  stimulus  worlds  dreamt  up  by 
psychologists  devising  their  experiments.  In 
particular  there  is  the  major  concern  of  finding 
an  independently  motivated  account  of  why  we 
attend  to  particular  dimensions  of  our  environment
rather  than  others.  Similarity-based  categorization
can  only  be  made  to  work  given  a
specification  of  relevant  dimensions.  It  is  per¬ 
fectly  possible  for  dimensional  weights  to  be 
adapted  to  the  distribution  of  stimuli  in  order 
to  maximise  the  coherence  of  categories  in  the 
space,  and  there  is  evidence  that  this  does  hap¬ 
pen.  But  the  selection  of  dimensions  from 
which  to  start  is  a  far  from  trivial  issue. 

In  this  talk,  I  will  discuss  arguments  and 
review  evidence  for  and  against  basing  catego¬ 
rization  on  similarity,  and  conclude  that,  con¬ 
strued  broadly,  similarity  may  still  have  a  key 
role  to  play  in  explaining  how  most  of  our  con¬ 
ceptual  categories  function. 

SIMILARITY-BASED
CATEGORIZATION

What  is  the  evidence  that  similarity  plays 
a  role  in  categorization?  To  answer  this  ques¬ 
tion  we  need  to  be  quite  precise  about  what  we 
mean  by  similarity.  We  form  categories  of  many 
different  kinds  in  the  course  of  everyday  cognition,
and  it  could  be  claimed  that  they  are  all
based  on  similarity.  But  this  would  be  to  ren¬ 
der  the  notion  so  broad  as  to  be  empty  or  more 
probably  circular. 

To  begin  with  examples  of  categories  that 
are  not  good  candidates  for  a  similarity-based 
account,  Barsalou  (1983)  pointed  to  the  exist¬ 
ence  of  what  he  termed  ad  hoc  categories  such 
as  Birthday  Presents  for  Your  Mother,  or  Things 
to  Take  on  a  Camping  Trip.  Members  of  these 
categories  are  of  course  similar  in  one  impor¬ 
tant  respect  —  things  to  take  on  a  camping  trip 
are  all  similar  in  as  much  as  they  are  all  good 
things  to  have  along  when  camping.  But  this 
tautological  similarity  does  not  go  far  in  ex¬ 
plaining  how  this  category  is  constructed.  Nor 
does  it  appear  that  the  degree  to  which  some¬ 
thing  is  a  good  member  of  the  category  is  relat¬ 
ed  in  any  way  to  its  similarity  to  other  mem¬ 
bers  in  any  respect  other  than  its  property  of 
being  in  the  category. 

Another  class  of  categories  which  could 
only  tautologically  be  explained  in  terms  of 
similarity  is  the  class  of  concepts  with  explicit
definitions.  Thus  belonging  to  the  conceptual
category  of  Triangle  depends  on  a  small
number  of  explicit  criteria,  such  that  only  sim¬ 
ilarity  in  those  respects  is  relevant  to  class 
membership.  To  say  that  all  triangles  are  sim¬ 
ilar  to  each  other  in  respect  of  having  three 
straight  sides,  three  angles,  and  internal  angles
that  sum  to  180°  is  to  say  little  more  than
that  all  triangles  possess  all  these  properties. 
At  the  same  time,  the  ratios  of  the  three  sides 
or  the  three  angles  may  affect  perceived  simi¬ 
larity  of  actual  triangles,  but  are  clearly  of  no 
relevance  to  the  issue  of  category  membership. 
Thus  similarity  reduces  to  identity  in  certain 
restricted  respects,  while  other  respects  are
treated  as  totally  irrelevant.  Categories  of  this 
kind  are  clearly  not  based  on  similarity,  ex¬ 
cept  in  a  purely  tautological  sense.  Similarity 
must  mean  more  than  simple  identity  on  a 
particular  set  of  dimensions,  and  there  should 
be  some  independent  justification  for  treating 
otherwise  salient  dimensions  as  being  irrele¬ 
vant  to  categorization. 

By  contrast,  we  form  many  other  categories, 
many  of  them  stable  and  long-term  parts  of  our 
conceptual  repertoire,  which  do  show  a  strong 
prima  facie  link  to  similarity.  These  categories
are  characterized  by  having  no  explicit  definition
(unlike  ad  hoc  categories  or  explicitly  defined
categories),  a  number  of  associated  properties
which  are  generally  true  of  category  members,
although  not  universally  so,  and  a  graded
structure  such  that  some  items  are  more  clearly 
and  uncontroversially  members  of  the  category 
than  are  others.  Rosch  and  Mervis  (1975)  termed
these  concepts  “family  resemblance”  or  Prototype
Concepts.  Prototypes  are  ideal  or  central
tendencies  around  which  categories  form.  The
category  is  then  composed  of  all  items  that  are
sufficiently  similar  to  the  prototype  (for  a  formal
treatment  see  Hampton,  1995a).  Prototype
theory  answers  the  key  question  of  how  dimen¬ 
sions  are  selected  by  proposing  that  our  biolog¬ 
ical  inheritance  and  social  and  cultural  environ¬ 
ment  provide  the  dimensions  along  which  we 
note  similarity  and  difference.  Where  a  number 
of  these  dimensions  correlate  in  our  experience, 
then  a  category  of  similar  items  is  formed,  to 
which  we  give  a  name,  and  which  we  can  then 



The  role  of  similarity  in  how  we  categorize  the  world 


use  as  a  concept  in  our  thinking  and  language. 
Once  the  dimensions  have  been  determined, 
clustering  of  the  world  into  classes  is  relatively 
automatic.  Indeed  there  are  advanced  statistical 
theories  of  how  items  may  be  clustered  based 
on  partially  correlated  dimensions  (van 
Mechelen  et  al.,  1993). 

There  are  several  iterative  feedback  loops 
in  this  process.  For  an  individual  learning  the 
categories  of  his  or  her  culture,  the  first  attempts 
to  understand  the  relevant  dimensions  may  be 
incorrect  and  may  need  refinement  through  er¬ 
ror  correction.  Keil  and  Batterman’s  (1984) 
study  of  the  Characteristic-to-Defining  shift  in 
young  children  shows  just  this  type  of  effect. 
Younger  children  took  account  of  more  percep¬ 
tually  striking  dimensions  in  making  categori¬ 
zation  judgements  about  concepts  such  as  Is¬ 
land,  Uncle  or  Lunch,  while  the  older  children 
had  homed  in  on  the  correct  concepts  as  deter¬ 
mined  by  adult  usage  of  the  words. 

At  the  cultural  level,  in  order  to  obtain  a 
cleaner  and  more  generally  useful  set  of  cate¬ 
gories,  the  weights  of  dimensions  get  adjusted 
or  new  dimensions  are  constructed  as  concepts 
evolve.  The  reason  that  younger  children  have 
to  adapt  their  concepts  to  pick  up  these  more 
hidden  or  subtle  conceptual  distinctions  is  that 
to  suit  its  purposes  our  culture  has  developed 
concepts  based  on  a  deeper  level  of  structure 
containing  more  relational  information  and  less 
dependent  on  mere  appearance. 

It  is  at  this  point  in  the  story  that  a  number 
of  psychologists  have  argued  that  something 
other  than  mere  similarity  and  feature  weights 
must  be  playing  a  role.  Part  of  our  drive  for 
knowledge  and  understanding  is  the  search  to 
replace  similarity-based  clusters  based  on  per¬ 
ceptual  appearance  by  explicitly  defined  con¬ 
cepts  with  broad  explanatory  power.  Keil 
(1989)  refers  to  this  as  the  principle  of  “origi¬ 
nal  sim”  —  that  children’s  initial  concepts  are 
based  on  pure  similarity,  which  is  then  replaced 
in  time  with  deeper,  more  theory-like  kinds  of 
conceptual  understanding. 

A  paradigm  example  of  this  process  can  be 
seen  in  the  progress  of  medical  science.  When 
medical  research  first  tackles  a  phenomenon  it
defines  a  syndrome:  a  cluster  of  symptoms
and  conditions  of  occurrence,  with  some  predictive
value  in  terms  of  treatment  and  prognosis.
(Most  mental  illnesses  are  at  this  stage  of  under¬ 
standing.)  It  is  characteristic  of  syndromes  that 
cases  may  be  more  or  less  typical,  and  more  or 
less  clear  members  of  the  syndrome.  Frequently 
cases  may  arise  that  are  borderline  to  the  syn¬ 
drome,  possessing  some  similarity  to  typical 
cases,  but  not  enough  to  be  clearly  identifiable 
as  an  example.  Discovery  of  an  aetiology  linked 
to  the  syndrome  —  such  as  an  infectious  organ¬ 
ism,  a  genetic  marker,  or  an  identifiable  biochem¬ 
ical  malfunction  —  will  usually  allow  the  syn¬ 
drome  to  be  replaced  by  a  clearly  defined  dis¬ 
ease  or  condition  category,  with  its  own  set  of 
diagnostic  tests.  Note  that  the  set  of  patients  and 
their  symptoms  has  not  changed  —  the  world 
has  not  become  more  clear-cut  in  any  way.  How¬ 
ever  whereas  before  a  case  was  borderline  be¬ 
cause  it  showed  marginal  levels  of  similarity  to 
other  cases,  a  case  will  now  be  borderline  if  the 
critical  diagnostic  tests  do  not  come  out  with  a 
clear  answer.  There  is  a  shift  from  an  uncertain¬ 
ty  which  is  conceptual  in  its  origin,  to  an  uncer¬ 
tainty  which  is  epistemological  —  that  is  to  say 
that  a  case  is  now  borderline  because  we  cannot 
discover  clearly  enough  whether  the  defining 
agent  is  at  work.  Our  uncertainty  has  to  do  with 
our  state  of  knowledge  in  the  particular  case, 
rather  than  our  state  of  understanding  of  such 
cases  in  general. 

This  extended  analogy  with  medical  sci¬ 
ence  serves  as  a  template  for  the  debate  that 
followed  publication  of  Murphy  and  Medin’s 
(1985)  attack  on  similarity  as  a  basis  for  natu¬ 
ral  concepts.  Physicians  seek  to  explain  the 
presenting  symptoms  through  a  causal  ac¬ 
count.  In  an  analogous  fashion,  Murphy  and 
Medin  argued  that  we  use  our  concepts  as  ways 
of  explaining  the  world  to  ourselves  and  oth¬ 
ers.  To  take  one  of  their  examples,  if  we  see 
someone  jump  fully  clothed  into  a  swimming 
pool  at  a  party,  we  may  categorize  them  as 
drunk.  We  do  not  have  to  do  this  by  compar¬ 
ing  their  behaviour  to  similar  examples  of 
drunken  behaviour  that  we  have  seen  in  the 
past  (although  actually  this  might  be  how  we 
do  it),  but  according  to  Murphy  and  Medin,
we  can  make  the  categorization  by  looking  for
the  category  that  best  provides  an  explanatory
account  of  the  behaviour  that  we  are  seeing.
Such  a  process  for  categorizing  through
causal  or  explanatory  “mini-theories”  is  a
much  more  powerful  means  of  categorizing, 
as  it  is  possible  to  use  it  to  categorize  exam¬ 
ples  that  are  far  removed  from  any  familiar 
experiences  that  we  may  have  had  in  the  past. 
According  to  this  account,  the  dimensions  on 
which  we  categorize  are  themselves  deter¬ 
mined  by  a  deeper  causal  explanatory  theory 
which  links  the  observable  facts  to  a  deeper 
underlying  cause,  and  so  makes  the  whole  cat¬ 
egory  a  coherent  set.  The  crucial  point  that 
Murphy  and  Medin  make  is  that  to  determine 
that  a  particular  drunken  behaviour  is  similar 
to  other  examples  of  drunken  behaviour  seen 
previously  requires  that  we  can  specify  in  just 
what  respects  that  similarity  is  measured.  But 
the  only  way  to  do  this  is  to  have  a  theory  of 
what  effect  alcohol  has  generally  on  behav¬ 
iour.  The  determination  of  similarity  depends 
on  the  theory,  and  so  cannot  itself  play  an  ex¬ 
planatory  role  in  the  categorization. 

It  follows  from  this  critique  that  we  cate¬ 
gorize  not  on  the  basis  of  a  similarity  cluster 
(akin  to  a  syndrome),  but  on  the  basis  of  select¬ 
ing  the  concept  that  best  explains  the  instance 
to  be  categorized  (as  in  a  disease  category).  This 
alternative  account  of  categorization  has  also 
had  wide  acceptance  in  the  developmental  field 
(Keil,  1989). 

The  difference  between  similarity  and  ex¬ 
planation-based  or  “causal  theory”  accounts  of 
categorization  was  brought  into  sharp  focus  in 
a  paper  by  Rips  (1989).  Rips  attacked  the  un¬ 
constrained  nature  of  similarity  as  a  basis  for 
categorization,  and  reported  a  number  of  dem¬ 
onstrations  of  cases  where  the  similarity  ac¬ 
count  clearly  fails.  Each  of  these  demonstra¬ 
tions  involved  the  discovery  of  a  non-mono- 
tonic  dissociation  in  the  relation  between  sim¬ 
ilarity  and  categorization.  If  categories  are 
formed  around  prototypes,  then  it  should  not 
be  the  case  that  one  item  could  be  more  similar 
to  (or  more  typical  of)  the  category  than  another,
but  yet  less  likely  to  belong.  In  formal  terms,
this  means  that  there  should  be  a  monotonic 
function  relating  similarity  to  a  category  and 
membership  in  that  category.  Rips  provided 
three  cases  where  this  constraint  was  broken. 
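
The monotonicity constraint can be stated operationally. The Python sketch below checks a set of items for violations, that is, pairs in which one item is more similar to the category than another yet less likely to belong. The item names and the similarity and membership values are invented for illustration and are not data from the studies discussed.

```python
from itertools import combinations

def monotonicity_violations(items):
    """Return pairs where higher similarity goes with lower membership.

    `items` maps an item name to (similarity, membership_probability).
    """
    violations = []
    for (name_a, (sim_a, mem_a)), (name_b, (sim_b, mem_b)) in combinations(items.items(), 2):
        # Opposite signs mean similarity and membership order the pair differently.
        if (sim_a - sim_b) * (mem_a - mem_b) < 0:
            violations.append((name_a, name_b))
    return violations

# Hypothetical ratings relative to a single category.
consistent = {"item_1": (0.9, 0.99), "item_2": (0.5, 0.90), "item_3": (0.4, 0.10)}
dissociated = {"item_4": (0.3, 0.60), "item_5": (0.4, 0.10)}
```

A prototype account predicts an empty list of violations for any set of items; a dissociation of the kind Rips reported corresponds to a non-empty list.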

In  his  first  case,  subjects  were  asked  to 
consider  a  hypothetical  item  that  was  exact¬ 
ly  half  way  between  two  categories,  one  a 
fixed  category  and  the  other  a  variable  cate¬ 
gory.  For  example  they  had  to  imagine  an 
object  that  was  half  way  between  the  largest 
US  quarter  they  had  seen  and  the  smallest 
pizza  they  had  seen.  Subjects  then  judged 
whether  this  object  was  either  (a)  more  sim¬ 
ilar  to  or  typical  of  one  category  rather  than 
the  other,  or  (b)  more  likely  to  be  a  member 
of  one  category  rather  than  the  other.  Rips 
reported  a  dissociation  between  similarity 
and  typicality  on  the  one  hand,  where  people 
generally  considered  similarity  to  be  about 
equal  to  each  category,  and  likelihood  of 
membership  on  the  other  hand,  where  peo¬ 
ple  generally  judged  the  object  more  likely 
to  be  in  the  variable  category  (the  pizza  in 
this  case).  Since  similarity  to  the  two  catego¬ 
ries  was  equal,  but  categorization  was  strong¬ 
ly  biased  in  favour  of  one,  Rips  argued  that 
categorization  behaviour  was  dissociated 
from  similarity. 

Rips’  second  example  involved  a  creature
(or  artifact)  which  metamorphosed  into  something
else.  For  example  a  bird-like  creature  was
transformed  into  an  insect-like  creature  through 
an  environmental  accident.  When  asked  wheth¬ 
er  it  was  more  similar  to  or  more  typical  of  a 
bird  as  opposed  to  an  insect,  people  went  for 
the  insect  category.  However  when  asked  which 
type  of  creature  it  was  more  likely  to  be,  they 
judged  the  creature  (marginally)  more  likely  to 
be  a  bird.  Once  again  there  was  a  dissociation 
in  that  whereas  similarity  pointed  to  categorization
in  one  category  (insect),  actual  categorization
preferences  were  for  placing  the  creature
in  the  other  (bird).

The  third  example  was  reported  in  a  paper 
by  Rips  and  Collins  (1993).  Subjects  were  given
information  about  the  shapes  of  two  (non-normal)
distributions  of  values  on  some  dimension,
for  example  daily  maximum  temperatures
for  two  particular  locations.  They  were  then 
given  particular  values  and  asked  to  judge  their 
typicality  as  an  example  of  each  distribution, 
or  asked  to  say  which  distribution  the  item  was 
more  likely  to  belong  to.  Under  these  condi¬ 
tions,  people  tended  to  base  similarity  judg¬ 
ments  on  distance  from  some  measure  of  cen¬ 
tral  tendency.  Likelihood  of  categorization 
however  was  based  on  a  more  extensional  form 
of  reasoning,  employing  intuitive  statistical  rea¬ 
soning  to  find  the  more  likely  category. 
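
The contrast between the two kinds of judgment can be sketched in a few lines of Python. Here a value is assigned to a location either by its distance from the location's mean or by the probability density of each distribution at that value. The two locations, their parameters and the test value are invented, and the normal shape is a simplifying assumption made only to keep the example short (the study itself used non-normal distributions).

```python
from statistics import NormalDist

# Hypothetical daily maximum temperatures (degrees C) for two locations.
locations = {
    "P": NormalDist(mu=20, sigma=2),    # tightly clustered around its mean
    "Q": NormalDist(mu=26, sigma=10),   # widely spread around its mean
}

def closest_by_central_tendency(value):
    """Pick the location whose mean is nearest to the value."""
    return min(locations, key=lambda name: abs(value - locations[name].mean))

def most_likely_source(value):
    """Pick the location whose distribution gives the value the highest density."""
    return max(locations, key=lambda name: locations[name].pdf(value))

value = 15
by_distance = closest_by_central_tendency(value)
by_likelihood = most_likely_source(value)
```

With these assumed parameters, a value of 15 lies closer to the mean of P, yet is more probable under the broader distribution Q, so the two judgments come apart.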

There  is  no  space  in  this  paper  to  go  into  a 
detailed  discussion  of  the  validity  of  Rips’ 
three  cases  of  non-monotonicity  (but  see 
Hampton,  1997,  for  a  fuller  discussion).  What 
is  clear  is  that  dissociations  between  typicali¬ 
ty  and  category  membership  can  be  demon¬ 
strated  albeit  with  relatively  non-standard 
types  of  material.  The  first  case  asked  people 
to  imagine  an  object  which  is  specified  only 
by  its  size.  The  second  involved  a  creature 
whose  appearance  changed,  but  about  whose 
internal  organs  and  genetic  make-up  subjects 
were  told  nothing,  and  the  third  case  involved 
presenting  subjects  with  strong  cues  to  employ 
extensional  reasoning  using  relative  frequen¬ 
cies  in  their  category  judgments.  (Physicians 
are  familiar  with  the  phenomenon  of  cases  that 
may  resemble  condition  A  more  than  condi¬ 
tion  B,  but  where  the  extreme  rareness  of  con¬ 
dition  A  means  that  a  diagnosis  of  condition 
B  is  more  likely  to  be  correct.) 
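
The diagnostic point in parentheses is an instance of Bayes' rule, and a short Python sketch makes the arithmetic explicit. The base rates and likelihoods are invented numbers, and the sketch assumes for simplicity that the patient has exactly one of the two conditions.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a set of mutually exclusive candidate conditions."""
    joint = {name: priors[name] * likelihoods[name] for name in priors}
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

# Invented numbers: the case resembles A more, but A is far rarer than B.
priors = {"A": 0.001, "B": 0.05}       # base rates among patients
likelihoods = {"A": 0.9, "B": 0.3}     # P(presenting symptoms | condition)

post = posterior(priors, likelihoods)
```

Although the symptoms fit condition A three times better than condition B in this example, the much higher base rate of B makes B the more probable diagnosis.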

One  aspect  that  all  three  demonstrations 
share  is  a  presupposition  that  categorization  is 
in  fact  all-or-none.  Thus  the  object  was  either  a 
coin  or  a  pizza,  it  was  either  a  bird  or  an  insect, 
and  either  from  one  distribution  or  the  other. 
The  categorization  task  was  always  presented 
to  the  subject  as  one  in  which  the  correct  cate¬ 
gorization  had  to  be  predicted  on  the  basis  of 
the  available  information.  As  noted  earlier,  this 
presupposition  is  antithetical  to  the  similarity- 
based  approach  where  the  correctness  of  a  cat¬ 
egorization  is  not  something  that  can  always 
be  resolved.  Some  items  are  by  their  nature 
borderline  to  a  class,  and  no  further  explora¬ 
tion  would  reveal  their  true  nature  any  better. 


EVIDENCE  FOR  SIMILARITY  IN 
CATEGORIZATION 

In  the  light  of  these  various  critiques  of  similarity-based
categorization  it  is  worth  briefly
reviewing  the  evidence  for  the  prototype  model.
First  there  is  the  fuzziness  of  many  of  our
concepts.  When  asked  to  reflect  on  the  mean¬ 
ing  of  words  like  “fish”,  “art”,  or  “sport”,  peo¬ 
ple  find  it  very  hard  to  give  a  theoretically  sat¬ 
isfactory  account  of  the  underlying  concepts. 
They  are  however  very  good  at  generating  ways 
in  which  members  of  the  category  differ  from 
other  things  in  the  same  domain.  They  can  also 
quickly  recall  or  create  examples  to  illustrate 
what  a  typical  category  member  might  be.  There 
is  apparently  a  rich  source  of  semantic  infor¬ 
mation  associated  with  the  concept,  but  it  does 
not  appear  to  be  organized  in  anything  like  the 
neat  structures  proposed  by  the  opponents  of 
prototype  theory.  The  lack  of  organization  and 
internal  coherence  becomes  particularly  clear 
when  people’s  reasoning  with  concepts  has 
been  studied.  Hampton  (1982)  showed  that  peo¬ 
ple  may  quite  willingly  agree  (for  example)  that 
School  Furniture  is  a  type  of  Furniture,  and  that 
a  blackboard  is  a  type  of  School  Furniture,  but 
yet  disallow  that  a  blackboard  is  a  type  of  Fur¬ 
niture.  Categorization  was  not  treated  as  a  uni¬ 
versally  transitive  relation,  in  contradiction  of 
both  classical  and  even  fuzzy  logic  (Zadeh, 
1965).  Instead,  I  argued  that  each  separate  category
judgment  was  made  on  the  basis  of  similarity.
As  the  basis  on  which  similarity  is  computed
changes  between  the  two  judgments,  it  is  then  quite
possible  to  obtain  intransitive  categorizations.
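
A minimal Python sketch shows how intransitivity can arise when each judgment is made separately on the basis of similarity. An item is accepted as a member when it possesses a sufficient share of the category's features. The feature lists and the threshold of 0.6 are invented for illustration and are not the materials or model of the original study.

```python
def belongs(item_features, category_features, threshold=0.6):
    """Categorize by the share of the category's features that the item has."""
    overlap = len(item_features & category_features)
    return overlap / len(category_features) >= threshold

# Invented feature lists, for illustration only.
furniture = {"found in rooms", "made of wood or metal", "movable", "supports objects"}
school_furniture = furniture | {"found in classrooms", "used for teaching"}
blackboard = {"found in rooms", "made of wood or metal", "found in classrooms",
              "used for teaching", "fixed to wall"}

judgments = {
    "school furniture is furniture": belongs(school_furniture, furniture),
    "blackboard is school furniture": belongs(blackboard, school_furniture),
    "blackboard is furniture": belongs(blackboard, furniture),
}
```

Because the features that matter differ between the two categories, the blackboard passes the test for School Furniture (four of six features) but fails the test for Furniture (two of four), even though School Furniture itself passes the test for Furniture.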

Tversky  and  Kahneman  (1983)  found  sim¬ 
ilar  effects  on  subjective  probability  judg¬ 
ments.  They  found  that  people  used  similari¬ 
ty  to  prototype  as  a  means  of  judging  subjec¬ 
tive  likelihood,  even  when  this  strategy  pro¬ 
duced  clearly  illogical  results,  such  as  judg¬ 
ing  it  more  likely  that  a  radical  female  student 
would  have  become  a  feminist  bank  teller,  than 
that  she  would  simply  have  become  a  bank 
teller.  This  conjunction  fallacy  was  paralleled 
by  the  finding  of  overextension  of  conjunc¬ 
tive  categories  by  Hampton  (1988).  People 
were  willing  to  say  for  example  that  Chess  was
a  Sport  which  is  a  Game,  even  though  they 
had  earlier  judged  that  Chess  was  not  a  Sport. 
Hampton  (1996a)  replicated  this  result  with  a 
between-subjects  design,  and  extended  the 
demonstration  of  inconsistent  classification  to 
the  case  of  negation.  For  example  80%  of  par¬ 
ticipants  in  one  group  considered  Tree  Hous¬ 
es  to  be  Buildings,  yet  100%  of  participants 
in  another  group  considered  them  to  be  Dwell¬ 
ings  that  are  not  Buildings.  Our  conceptual 
categories  display  a  degree  of  flexibility  and 
context  sensitivity  which  is  much  more  easily 
captured  by  a  similarity-based  process  than  by 
a  fixed  theoretical  schema.  A  recent  study  by 
Sloman  (1997)  is  a  further  demonstration  of 
how  similarity  can  be  shown  to  affect  people’s 
reasoning.  In  one  demonstration,  Sloman 
found  that  people  were  more  likely  to  accept 
the  truth  of  a  logically  necessary  conclusion 
when  the  two  premises  were  similar  than  when 
they  were  not.  Similarity  apparently  pervades 
people’s  attempts  to  reason  logically,  and  a 
very  simple  explanation  for  this  finding  is  that 
our  conceptual  system  is  heavily  dependent 
on  similarity-based  conceptual  processes. 

A  critical  test  of  similarity-based  catego¬ 
rization  is  the  extent  to  which  categorization 
can  be  influenced  by  “irrelevant”  kinds  of 
similarity.  There  is  a  distinction  in  the  litera¬ 
ture,  originally  introduced  by  Smith,  Shoben 
and  Rips  (1974),  between  Defining  and  Char¬ 
acteristic  Features.  It  was  their  notion  that 
there  were  many  properties  of  objects  which 
might  determine  how  typical  they  were  of 
their  class,  but  which  would  be  irrelevant  to 
their  category  membership.  Their  example 
was  that  the  ability  to  fly  is  very  typical  of 
birds,  and  so  flying  birds  are  more  typical 
members  of  their  class.  Flight  as  such  how¬ 
ever  is  irrelevant  to  determining  whether  a 
creature  is  a  bird  or  not,  since  there  are  both 
birds  that  do  not  fly  and  other  creatures  (no¬ 
tably  insects)  that  do  fly.  Smith  et  al.  termed 
this  idea  the  Characteristic  Feature  Hypoth¬ 
esis.  Hampton  (1995b)  set  out  to  test  wheth¬ 
er  Characteristic  Features  (CF)  are  in  fact 
always  irrelevant  to  categorization  in  practice.
To  test  this  idea,  I  created  sets  of  six
hypothetical  objects  for  each  of  a  number  of 
concepts.  Each  object  either  possessed  or 
lacked  a  full  set  of  CF.  In  addition  each  ob¬ 
ject  either  had  a  full  set  of  Defining  Features 
(DF+),  lacked  at  least  one  Defining  Feature 
(DF-),  or  had  a  partial  match  to  the  Defining
Features  (DF?).  The  aim  of  the  experiment
was  first  to  show  that  when  the  object  pos¬ 
sessed  the  DF,  categorization  would  be  clear¬ 
ly  positive,  and  when  it  lacked  at  least  one 
DF,  then  it  would  be  clearly  negative,  regard¬ 
less  of  the  CF.  The  critical  test  was  then  to 
be  whether  the  CF  would  affect  categoriza¬ 
tion  when  the  DF  were  only  partially 
matched.  For  example  consider  an  object 
which  partially  matched  the  DF  of  umbrel¬ 
las  -  it  was  designed  to  keep  things  from  fall¬ 
ing  on  you,  but  instead  of  protecting  you  from 
the  rain  it  was  intended  to  protect  you  from 
acorns  and  twigs  when  picnicking  under  a 
tree.  Would  this  odd  object  be  more  likely  to 
be  categorized  as  an  umbrella  if  it  had  the 
classical  domed  shape  and  material  of  um¬ 
brellas,  than  if  it  was  built  in  some  different 
shape  and  material? 

In  the  event  this  critical  second  test  could 
not  easily  be  performed.  The  reason  was  that 
it  proved  very  hard  (even  after  four  replica¬ 
tions  of  the  experiment  with  improved  mate¬ 
rials  and  improved  instructions),  to  find  CF 
which  did  not  still  influence  categorization, 
even  when  the  DF  were  clearly  present  or  ab¬ 
sent.  For  instance,  one  DF+,  CF-  item
was  the  following  description: 

“The  offspring  of  two  zebras,  this  creature
was  given  a  special  experimental  nutritional  diet
during  development.  It  now  looks  and  behaves
just  like  a  horse,  with  a  uniform  brown  color.”

When  asked  if  this  was  really  a  zebra,
only  a  third  of  the  subjects  agreed,  the  rest 
ignoring  the  genotype  in  favor  of  the  pheno¬ 
type,  contrary  to  the  assumptions  of  both  bi¬ 
ological  theory  and  psychological  essential- 
ism.  Similar  problems  occurred  when  I  at¬ 
tempted  to  pit  the  intended  function  of  arti¬ 
facts  (assumed  to  reflect  their  real  nature) 
against  their  outward  appearance.  People
tended  to  be  influenced  by  similarity  along
dimensions  which  logical  analysis  suggests 
should  be  irrelevant  —  unless  of  course  cat¬ 
egorization  is  based  on  similarity  calculated 
across  a  wide  range  of  dimensions. 

Returning  to  the  critique  offered  by  Rips 
(1989),  an  unpublished  study  by  Hampton  & 
Estes  attempted  partially  to  replicate  Rips’
transformation  study.  We  felt  that  the  design 
of  the  original  study  may  have  encouraged 
subjects  to  dissociate  the  similarity/typicali¬ 
ty  and  categorization  judgments,  simply  be¬
cause  both  questions  were  always  asked  to¬ 
gether  after  every  scenario.  We  modified  the 
procedure  in  a  number  of  ways,  the  main  one 
of  which  was  to  have  different  groups  of  stu¬ 
dents  making  judgments  of  typicality  or  judg¬ 
ments  of  categorization.  The  startling  find¬ 
ing  was  that  the  dissociation  completely  dis¬ 
appeared.  There  were  no  differences  in  the 
mean  ratings  for  typicality  or  categorization 
in  any  of  the  conditions.  Thus  when  the  crea¬ 
ture  was  not  yet  transformed  it  was  uniform¬ 
ly  rated  as  typical  of,  and  likely  to  belong  in, 
the  initial  category.  After  the  transformation, 
both  typicality  and  categorization  switched 
to  the  final  category.  When  the  nature  of  the 
transformation  was  changed  from  an  “acci¬ 
dental”  change  induced  by  environmental 
pollution,  to  a  “natural”  change  due  to  bio¬ 
logical  maturation,  both  typicality  and  cate¬ 
gorization  judgments  showed  some  degree  of 
switch  towards  the  final  category,  but  there 
was  still  no  dissociation. 

In  a  further  unpublished  study  by  Hamp¬ 
ton  &  Hainitz,  using  a  similar  design,  we  var¬ 
ied  whether  the  transformation  affected  just 
the  surface  external  appearance  (through  sur¬ 
gical  intervention)  or  affected  the  deeper  in¬ 
ternal  biology  of  the  creature  (through  envi¬ 
ronmental  pollution).  The  degree  to  which  the 
creature  was  believed  to  have  changed  was 
greater  for  the  deep  transformation  than  for 
the  surface  one,  and  there  was  a  greater  shift 
towards  the  final  category  for  the  typicality 
judgements  than  for  categorization.  Yet  there 
was  still  no  evidence  of  a  clean  dissociation 
in  the  two  judgments. 


DISSOCIATING  CATEGORIZATION
AND  SIMILARITY  IN  NATURAL
CATEGORIES

According  to  the  Prototype  Model,  catego¬ 
rization  proceeds  by  assessing  the  similarity  of 
an  instance  or  subclass  to  the  concept  proto¬ 
type,  and  then  testing  whether  it  passes  some 
threshold  criterion  for  category  membership.  If 
this  model  is  inadequate,  then  as  Rips  (1989) 
argued,  it  should  be  possible  to  demonstrate 
non-monotonicity  between  measures  such  as 
typicality  or  similarity  to  prototype  (on  the  one 
hand)  and  likelihood  of  category  membership 
(on  the  other).  Hampton  (1997)  set  out  to  dis¬ 
cover  to  what  extent  non-monotonicity  of  this 
kind  could  be  found  in  everyday  common  se¬ 
mantic  categories.  Rips  (1989)  used  a  variety 
of  unusual  examples  to  dissociate  similarity  and 
categorization,  and  it  is  questionable  how  gener- 
alizable  such  results  are  to  the  more  usual  pro¬ 
cess  of  deciding  if  subclass  A  is  a  member  of 
category  B.  It  is  therefore  interesting  to  know 
whether  categorization  in  a  common  category 
such  as  Fish  or  Vehicle  follows  typicality  in 
the  category,  or  whether  dissociations  between 
the  measures  can  be  found.  To  answer  this  ques¬ 
tion,  I  reanalyzed  a  data  set  published  in  1978 
by  McCloskey  and  Glucksberg,  in  which  they 
had  two  groups  of  subjects  making  judgments 
about  18  semantic  categories.  One  group  were 
asked  to  make  typicality  judgments  for  a  list  of 
30  items  for  each  category,  ranging  from  clear 
category  members  to  clear  non-members.  A 
second  group  gave  a  simple  Yes/No  categori¬ 
zation  decision  about  each  item  for  each  cate¬ 
gory.  This  second  group  returned  a  month  later 
and  made  their  categorization  decisions  a  sec¬ 
ond  time.  McCloskey  and  Glucksberg  (1978) 
found  that  the  categorizations  showed  fuzziness 
in  two  respects.  First,  there  was  considerable 
disagreement  amongst  people  over  which  items 
should  be  included  in  the  categories  and  which 
should  not.  This  disagreement  was  reflected  in 
a  large  number  of  items  with  Categorization 
Probability  at  intermediate  levels  between  0  and 
1.  Second,  there  was  a  considerable  degree  of
within-subject  inconsistency  when  the  follow-up
test  was  made.  High  levels  of  disagreement
and  inconsistency  were  most  noticeable  for 
items  in  the  middle  of  the  typicality  scale  — 
that  is  for  items  that  were  neither  clear  mem¬ 
bers  nor  clear  non-members.  McCloskey  and 
Glucksberg  concluded  that  categorization  in 
many  semantic  categories  is  fuzzy,  rather  than 
all-or-none,  and  that  there  is  a  considerable 
amount  of  instability  in  how  we  categorize. 

The  data  from  this  research  were  published 
as  an  Appendix,  and  provided  an  opportunity  to 
test  for  non-monotonicity  directly.  Typicality 
ratings  are  prima  facie  direct  measures  of  how 
similar  an  instance  or  class  is  to  the  category 
prototype.  The  instructions  for  typicality  empha¬ 
size  that  a  high  rating  should  be  given  to  items 
that  are  representative  or  good  examples  of  the 
class  as  a  whole.  On  the  other  hand  Categoriza¬ 
tion  Probability  is  a  simple  way  of  measuring 
the  degree  to  which  something  is  categorized  in 
a  class.  If  we  assume  that  there  are  random  and 
individual  sources  of  variation  in  categorization, 
then  the  group  measure  of  how  many  subjects 
say  X  is  in  category  Y  may  be  taken  as  a  fairly 
direct  measure  of  the  degree  to  which  X  is  con¬ 
sidered  to  belong  in  Y  by  each  individual. 

The  data  were  therefore  analyzed  in  order 
to  examine  the  mathematical  relationship  be¬ 
tween  mean  rated  typicality  and  categorization 
probability.  Technical  details  can  be  found  in 
Hampton  (1996b).  The  first  conclusion  was  that 
there  were  clear  differences  between  individual 
categories  in  terms  of  how  clearly  categoriza¬ 
tion  probability  could  be  predicted  from  typi¬ 
cality.  For  example,  Sport  showed  a  clear  thresh¬ 
old  function,  with  practically  no  systematic  de¬ 
viation  from  the  expected  pattern  of  categoriza¬ 
tion  probability  rising  with  typicality.  For  Fish 
on  the  other  hand,  there  was  a  considerable 
spread  of  items  above  and  below  the  threshold 
function,  and  plenty  of  evidence  for  non-monotonicity.
There  was  no  link  however  between
how  well  the  measures  correlated  and  the  kind 
of  semantic  domain.  There  were  good  and  bad 
fits  in  both  natural  kind  and  artifact  categories. 

In  order  to  explore  the  various  possible  rea¬ 
sons  why  some  items  should  not  follow  a  clean 
threshold  function  but  instead  should  be  scattered
above  and  below  the  function,  a  regression
function  was  fitted  to  the  data  from  all  17  cate¬ 
gories,  (one  category  was  excluded  for  technical 
reasons),  and  the  residual  categorization  proba¬ 
bility  was  calculated  for  each  item.  The  items 
with  categorization  probability  significantly 
higher  or  lower  than  that  expected  for  their  typ¬ 
icality  were  examined  in  more  detail,  and  a  num¬ 
ber  of  hypotheses  suggested  themselves  to  ac¬ 
count  for  the  variation.  First,  there  were  a  num¬ 
ber  of  very  unfamiliar  items  such  as  Euglena,  or
Lamprey,  which  had  categorization  probability 
higher  than  expected  from  Typicality.  Typicali¬ 
ty  ratings  are  known  to  be  affected  by  familiari¬ 
ty  (Barsalou,  1985;  Hampton  &  Gardiner,  1983). 
It  is  therefore  quite  likely  that  low  familiarity 
with  an  item  may  depress  its  Typicality  without 
affecting  its  categorization. 

On  the  other  hand  there  were  items  with 
lower  categorization  probability  than  expect¬ 
ed,  which  appeared  to  be  semantically  associ¬ 
ated  with  the  category,  but  not  actually  category
members.  Examples  were  Orange  Juice  as  a
Fruit,  or  Egg  as  an  Animal.  Bassok  and  Medin
(1997)  have  shown  that  semantic  associated- 
ness  can  give  a  sense  of  similarity,  and  it  is  not 
unreasonable  to  suppose  that  Typicality  ratings 
may  also  reflect  associatedness  to  an  extent  that 
is  not  seen  in  categorization  itself. 

Two  further  hypotheses  were  related  to  the 
distinction  that  Rips,  Keil  and  others  have 
stressed  —  namely  the  distinction  between  the 
surface  appearance  of  objects,  and  their  deeper 
nature.  Some  items  bear  a  superficial  resem¬ 
blance  to  a  category  to  which  they  do  not  be¬
long  —  a  whale  as  a  Fish  is  perhaps  the  best 
known  example.  Other  items  bear  little  resem¬ 
blance  to  the  category  to  which  they  do  belong
—  as  might  be  the  case  for  tomatoes  and  Fruit. 
It  may  be  expected  that  items  that  are  techni¬ 
cally  not  members  should  have  lower  category 
probability  than  expected,  while  those  that  are
only  technically  members  should  have  higher 
probability  than  expected. 

A  final  hypothesis  concerned  the  effect  of 
contrast  categories  on  typicality  and  categoriza¬ 
tion.  Similarity  to  a  prototype  may  be  calculated 
without  regard  to  any  contrasting  or  overlapping 
categories  of  which  the  item  may  be  a  member.
Categorization  however  may  proceed  in  a  more 
contrastive  manner,  in  that  people  may  prefer  to 
categorize  each  item  in  just  one  category  (as  in 
the  mutual  exclusivity  principle,  adopted  by 
young  children  in  word  learning — Clark,  1973). 
If  an  item  is  a  better  member  of  some  contrast¬ 
ing  or  overlapping  category,  then  perhaps  its 
categorization  probability  would  be  less  than 
expected  from  its  typicality. 

These  various  hypotheses  were  collected 
together  and  tested  in  a  rating  questionnaire 
which  was  administered  to  twenty  students  at 
the  University  of  Chicago.  From  this  question¬ 
naire,  variables  were  computed  for  each  item, 
corresponding  to  its  Unfamiliarity,  the  degree 
to  which  it  was  Only  Technically  a  member,  or 
Technically  Not  a  member,  the  degree  to  which 
it  was  judged  a  Part  or  Property  rather  than  a 
true  member,  and  the  degree  to  which  it  also 
belonged  in  a  Contrast  category.  These  five  new 
variables  were  entered  into  a  regression  to  pre¬ 
dict  residual  categorization  probability  when 
the  effect  of  Typicality  had  been  removed.  Four 
of  the  five  variables  proved  to  be  significant 
predictors,  in  the  expected  direction.  Items  that 
were  Unfamiliar,  or  were  Only  Technically 
members,  were  associated  with  positive  resid¬ 
uals  —  they  were  more  likely  to  be  categorized 
positively  than  warranted  by  their  typicality. 
Items  that  were  associated  parts  or  properties, 
or  that  were  Technically  Not  members  were 
associated  with  negative  residuals  —  they  were 
less  likely  to  be  categorized  positively  than  was 
warranted  by  their  typicality.  The  Contrast  vari¬ 
able  had  no  overall  predictive  effect  on  residu¬ 
al  categorization  probability. 
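
The two-stage analysis described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the logistic form of the threshold function, the helper names (`fit_line`, `logit`, `residual_probabilities`) and all the numbers are invented for the example and are not taken from McCloskey and Glucksberg (1978) or from the reanalysis itself. A single predictor (Unfamiliarity) stands in for the five predictors used in the study.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logit(p, eps=0.01):
    """Log-odds, with probabilities clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def residual_probabilities(typicality, cat_prob):
    """Observed minus predicted categorization probability for each item.

    Assumption: the threshold function is approximated by a logistic curve
    fitted on the log-odds scale.
    """
    a, b = fit_line(typicality, [logit(p) for p in cat_prob])
    predicted = [1 / (1 + math.exp(-(a + b * t))) for t in typicality]
    return [p - q for p, q in zip(cat_prob, predicted)]

# Invented illustration data: eight items in one hypothetical category.
typicality    = [1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.5]
cat_prob      = [0.02, 0.30, 0.10, 0.35, 0.80, 0.60, 0.95, 0.99]
unfamiliarity = [1.0, 6.0, 1.0, 2.0, 7.0, 1.0, 2.0, 1.0]

# Stage 1: remove the effect of typicality.
residuals = residual_probabilities(typicality, cat_prob)

# Stage 2: ask whether a rated variable predicts what is left over.
intercept, slope = fit_line(unfamiliarity, residuals)
print("slope of residual on unfamiliarity:", round(slope, 3))
```

A positive slope in the second stage corresponds to the pattern reported for Unfamiliar items: they are categorized positively more often than their typicality alone would predict.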

A  subsequent  analysis  compared  the  4  bio¬ 
logical  categories  (Animal,  Bird,  Fish  and  In¬ 
sect),  with  the  5  artifact  categories  (Clothing, 
Furniture,  Kitchen  Utensil,  Ship  and  Vehicle). 
It  was  found  that  the  two  “Technical”  predic¬ 
tors  were  significant  for  the  biological  catego¬ 
ries,  but  not  for  the  artifacts.  On  the  other  hand, 
the  Contrast  category  predictor  was  significant 
only  for  the  artifact  categories.  This  difference 
is  consistent  with  the  fact  that  people  may  be 
influenced  by  biological  classification  in  the
zoological  categories,  but  that  no  correspond¬
ing  theory  exists  for  artifacts.  Similarly,  arti¬ 
facts  often  fall  into  overlapping  categories  (a 
knife  may  be  either  a  tool,  a  weapon  or  a  kitch¬ 
en  utensil),  whereas  biological  categories  are 
usually  mutually  exclusive.  Hampton  (1997) 
concluded  that  there  were  few  systematic  devi¬ 
ations  from  monotonicity  and  many  of  them 
could  be  accounted  for  by  the  effects  of  famil¬ 
iarity  or  associatedness  on  typicality  ratings. 
There  was  also  evidence  that  typicality  gives 
less  weight  to  “technical”  or  deeper  aspects  of 
objects  than  does  categorization. 

WHAT  ROLE  DOES  SIMILARITY 
PLAY? 

In  this  paper  I  have  suggested  that  similar¬ 
ity-based  categorization  is  in  fact  a  widespread 
phenomenon,  affecting  not  only  the  common 
everyday  use  of  categories,  but  also  people’s 
reasoning  processes  about  those  categories.  It 
would  be  foolish  to  argue  that  all  of  our  cate¬ 
gories  are  constructed  on  the  basis  of  putting 
similar  things  together.  We  would  certainly 
have  made  little  progress  culturally  or  scientif¬ 
ically  if  our  conceptual  repertoire  were  limited 
to  such  categories.  How  then  can  the  evidence 
for  similarity-based  categorization  be  squared 
with  this  notion  that  our  concepts  should  not 
be  based  on  similarity? 

There  are  two  issues  here  to  be  kept  sepa¬ 
rate.  The  first  is  that  the  world  contains  impor¬ 
tant  distinctions  that  are  not  always  immedi¬ 
ately  obvious  in  the  outward  appearance  of 
objects.  Two  mushrooms  may  be  very  similar, 
but  whereas  one  makes  a  tasty  meal,  the  other 
is  deadly  poisonous.  A  crude  view  of  similari¬ 
ty-based  categorization  would  argue  that  we 
could  never  learn  this  distinction,  since  it  would 
require  forming  a  category  that  cuts  across  the 
way  things  appear  to  us  perceptually.  This  view 
is  to  take  perceptual  (in  fact  usually  visual)  sim¬ 
ilarity  as  the  only  meaningful  way  of  defining 
similarity.  Perceptual  similarity  is  indeed  a  very 
powerful  and  salient  factor  in  our  thinking,  and 
it  probably  represents  the  “prototypical”  or  de¬ 
fault  way  in  which  we  understand  similarity. 
(It  was  not  so  much  the  principle  of  similarity-
based  categorization  that  Rips  (1989)  was  at¬ 
tacking,  so  much  as  the  notion  of  categoriza¬ 
tion  based  on  resemblance  in  appearance.) 

There  is  however  a  more  powerful  way  to 
treat  similarity,  in  which  any  dimension  may 
enter  into  the  computation  of  similarity.  We 
might  then  talk  of  “deep  similarity”  as  opposed
to  “surface  similarity”.  If  some  subtle  morpho¬
logical  characteristic  of  the  mushrooms  provid¬ 
ed  a  clear  predictor  of  the  effects  of  eating  them, 
then  this  characteristic  would  be  given  a  very 
high  weight  in  the  computation  of  similarity  for 
the  purpose  of  culinary  classification.  After  all
there  is  very  little  similarity  in  the  effects  of 
eating  the  two  mushrooms,  and  this  factor 
would  be  sufficiently  important  to  carry  great 
weight  in  determining  categorization. 
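
The idea that any dimension may enter the computation, with weights that depend on the purpose of the classification, can be illustrated with a small sketch. Everything here is hypothetical: the function `weighted_similarity`, the feature names (including the made-up `subtle_marker` dimension) and the weight values are assumptions chosen only to show how the same two items can be highly similar under "surface" weights and highly dissimilar under "culinary" weights.

```python
def weighted_similarity(a, b, weights):
    """Weighted proportion of dimensions on which items a and b share a value."""
    total = sum(weights.values())
    shared = sum(w for dim, w in weights.items() if a[dim] == b[dim])
    return shared / total

# Two hypothetical mushrooms that look alike but differ on one subtle feature.
edible = {"cap_colour": "brown", "shape": "domed", "size": "medium",
          "subtle_marker": "absent"}
deadly = {"cap_colour": "brown", "shape": "domed", "size": "medium",
          "subtle_marker": "present"}

# Surface similarity: appearance dominates, the subtle feature barely counts.
surface_weights = {"cap_colour": 1.0, "shape": 1.0, "size": 1.0,
                   "subtle_marker": 0.1}

# Culinary classification: the feature that predicts the effect of eating
# the mushroom is given a very high weight.
culinary_weights = {"cap_colour": 0.1, "shape": 0.1, "size": 0.1,
                    "subtle_marker": 5.0}

surface = weighted_similarity(edible, deadly, surface_weights)
culinary = weighted_similarity(edible, deadly, culinary_weights)
print(round(surface, 2), round(culinary, 2))
```

The matching rule is deliberately crude (values either match or do not); the point is only that re-weighting dimensions, rather than abandoning similarity, is enough to separate the two items.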

The  first  point  is  therefore  that  similarity 
must  be  broadened  to  encompass  a  range  of 
semantic  information  that  goes  well  beyond  the 
perceptual  appearance  of  objects.  When  this  is 
properly  understood,  it  is  clear  for  example  why 
whales  should  not  be  fish.  When  examined 
more  closely,  when  their  behavior  is  observed 
and  their  internal  organs  (lungs,  warm  blood, 
brains  etc.)  are  inspected,  their  similarity  to  oth¬ 
er  mammals,  and  dissimilarity  from  fish  be¬ 
comes  quite  obvious.  There  is  no  need  for  a 
theory  of  evolution  to  make  this  observation, 
just  a  curiosity  about  the  way  things  are.

The  second  point  is  that  over  and  above  the 
ability  to  use  similarity  as  the  basis  for  catego¬ 
rization,  we  have  the  capacity  to  think  in  a  more 
precise  logical  fashion.  We  can  define  explicit 
terms  such  as  Prime  Number  or  Triangle,  or  we 
can  define  explicit  goals  to  be  satisfied  (as  in 
Barsalou’s  ad  hoc  categories).  If  told  a  catego¬ 
rization  rule,  we  can  readily  apply  it  to  the 
world,  and  indeed  there  is  a  growing  body  of 
results  which  suggests  that  if  asked  to  invent  a
categorization  scheme  we  have  a  strong  bias
for  rules  based  on  single  dimensions  (Medin,
Wattenmaker  &  Hampson,  1987).

This  type  of  more  axiomatic  thought  has  obviously
led  to  the  huge  success  of  mathematics
and  the  mathematical  sciences,  and  by  its  nature 
it  makes  little  use  of  similarity.  Scientific  concepts
tend  to  form  all-or-none  categories,  which
can  enter  into  logical  relations  and  scientific  laws 
with  absolute  certainty.  Before  the  days  of  nu¬ 
merical  taxonomy,  it  was  considered  an  essential 
requirement  for  classification  schemes  that  they 
should  be  based  on  monothetic  criteria.  Debate
centred  on  which  were  the  most  appropriate  di¬ 
mensions  or  features  with  which  to  create  sub¬ 
classes,  and  the  value  of  a  classification  was  to  be 
found  in  the  theoretically  interesting  generalisa¬
tions  that  it  permitted  one  to  make. 

What  should  be  obvious  to  most  psycholo¬ 
gists  who  have  attempted  to  study  this  more 
“advanced”  type  of  thought  is  that  it  is  actually
very  difficult  for  most  people.  School  teachers 
have  to  spend  hours  and  hours  of  patient  expla¬ 
nation  to  get  the  majority  of  students  to  under¬ 
stand  the  principles  of  mathematics  or  scien¬ 
tific  laws  and  their  concepts,  and  the  majority 
of  the  population  never  succeed  in  mastering 
the  necessary  skills  in  more  than  a  rudimentary
form.  From  the  earliest  days  of  experimental
psychology  it  has  been  shown  that  people  are
poor  at  following  the  abstract  logic  of  syllo¬
gisms,  conditionals,  or  probability.  They  are
also  poor  at  using  analogy  in  problem  solving
unless  surface  similarity  helps  to  cue  the  ap¬
propriate  connection.  Arguments  that  similari¬
ty-based  categorization  is  inadequate  since  it
cannot  form  a  solid  foundation  of  concepts  for
logic  and  reasoning  are  therefore  founded  on  a
dubious  premise  —  namely  that  most  people 
have  such  a  foundation  readily  available  to 
them.  It  is  perhaps  more  realistic  to  suppose 
that  similarity  forms  the  basis  of  most  people’s 
concepts  most  of  the  time,  and  that  some  indi¬
viduals,  with  a  lot  of  training  and  with  the  ad¬
vantage  of  the  cultural  transmission  of  ideas
from  great  thinkers  of  the  past,  are  able  to  de¬
velop  more  advanced  thinking  skills  in  partic¬
ular  domains.  Dimly  remembered  lessons  may
lead  us  to  believe  that  our  concepts  are  clearer
than  they  really  are,  or  to  defer  to  experts  as
keepers  of  the  truth.  However  for  everyday  pur¬ 
poses  we  are  content  to  continue  putting  togeth¬ 
er  things  that  are  (superficially  or  deeply)  sim¬ 
ilar.  After  all,  such  a  system  serves  us  perfectly
well  for  most  daily  purposes. 
A  similar  point  can  be  made  about  those
concepts  that  reflect  deeper  theoretical  infor¬ 
mation,  such  as  many  biological  or  natural  kind 
terms.  Through  the  evolution  of  our  culture  and 
its  interest  in  scientific  knowledge,  we  have  (as 
a  culture)  developed  sophisticated  concepts 
such  as  mammal,  vertebrate  or  insect,  and  the 
proper  definition  of  these  terms  requires  edu¬ 
cated  attention  to  scientifically  relevant  dimen¬ 
sions  of  the  creatures  in  question,  and  may  of¬ 
ten  fly  in  the  face  of  superficial  resemblance  in 
the  appearance  of  objects.  As  responsible  mem¬ 
bers  of  our  linguistic  and  cultural  communities 
we  feel  bound  to  defer  to  experts  in  the  correct 
application  of  these  terms,  at  least  in  discourse 
contexts  where  “correct”  classification  matters. 
This  “linguistic  division  of  labour”  has  been 
noted  among  others  by  Putnam  (1975).  Medin 
and  Ortony  (1989)  describe  the  same  situation 
using  the  notion  of  “psychological  essential- 
ism”  —  the  common  belief  that  many  natural 
kinds  have  an  essence  by  which  they  can  be 
correctly  classified,  even  though  that  essence 
and  how  to  detect  it  may  be  unknown  to  the  lay 
person.  My  point  is  that  although  we  may  defer 
to  experts  and  correct  definitions  when  the  con¬ 
text  requires,  we  are  also  very  willing  to  fall 
back  on  a  similarity-based  concept  of  many 
natural  kind  terms  for  other  purposes.  Studies 
of  natural  kinds  (e.g.  Hampton,  1995;  Kalish, 
1995;  Malt,  1994)  have  shown  that  people  are 
equally  happy  to  think  of  natural  kind  catego¬ 
ries  as  showing  family  resemblance  structure, 
and  of  categorization  in  such  categories  as  al¬ 
lowing  for  degrees  of  membership  depending 
on  similarity  to  known  typical  examples. 
ABSTRACT

Structure-mapping  is  fast  emerging  as  a  uni¬ 
fying  principle  for  a  variety  of  different  phenomena,
including  analogy,  metaphor,  similarity
and  conceptual  combination.  In  this  paper,  we 
argue  that  it  is  inappropriate  to  extend  this  idea 
to  conceptual  combination,  as  has  been  done  in 
the  dual-process  theory  (see  Wisniewski,  1997a, 
1997b).  There  are  theoretical  and  empirical 
grounds  for  taking  up  this  position.  We  propose 
an  alternative  account  based  on  the  constraint 
theory  of  combination,  which  sees  the  interpretation
of  concept  combinations  as  one  of  satisfying
multiple  constraints  of  diagnosticity,  plausibility
and  informativeness.  This  theory,  which
we  would  like  to  advertise  as  being  the  truth,
does  not  use  structure-mapping.

INTRODUCTION

Structure-mapping  or  structural  alignment  is 
fast  emerging  as  a  unifying  principle  for  a  variety 
of  different  phenomena,  including  analogy  (e.g.,
Gentner,  1983;  Holyoak  &  Thagard,  1995;  Keane,
1988;  Keane,  Ledgeway  &  Duff,  1994),  metaphor
(e.g.,  Gentner,  1982;  Gentner  &  Wolff,  in  press;
Veale  &  Keane,  1994,  1997),  similarity  (e.g.,
Markman  &  Gentner,  1993a,  1993b;  Markman
&  Wisniewski,  1997;  Goldstone,  1994;  Goldstone
&  Medin,  1994)  and  conceptual  combination
(Wisniewski,  1996,  1997a,  1997b;  Wisniewski  &
Markman,  1993).  In  this  paper,  we  argue  that  it  is
inappropriate  to  extend  this  structure-mapping
account  to  conceptual  combination;  that  is,  to  the
process  that  enables  people  to  interpret  novel  com¬
binations  like  horse  bird,  river  chair  and  so  on. 


Structural  alignment  is  a  process  that  match¬ 
es  the  relational  structure  of  two  domains  of 
knowledge  (e.g.,  concepts  or  stories)  in  accor¬ 
dance  with  the  systematicity  principle  (Gentner,
1983).  This  idea  has  been  instantiated  quite  pre¬
cisely  in  a  number  of  computational  models  in¬
cluding  the  Structure  Mapping  Engine
(Falkenhainer,  Forbus  &  Gentner,  1986,  1989;  Forbus
&  Oblinger,  1990;  Forbus,  Ferguson  &  Gentner,
1994),  the  Incremental  Analogy  Machine
(Keane,  1990,  1997;  Keane  &  Brayshaw,  1988;
Keane  et  al.,  1994),  ACME  (Holyoak  &  Thagard, 
1989)  and  LISA  (Hummel  &  Holyoak,  1997). 
In  the  context  of  conceptual  combination,  struc¬ 
tural  alignment  is  used  to  explain  the  generation 
of  certain  classes  of  interpretation  that  are  produced
for  novel  compounds.

In  the  remainder  of  this  paper  we  argue 
against  this  proposal¹.  First,  we  describe  the
dual-process  theory  in  some  detail.  Second,  we 
object  to  this  account  with  an  alternative  ac¬ 
count  called  the  constraint  theory.  Third,  we 
outline  evidence  favouring  the  constraint  theo¬ 
ry  over  dual-process  theory. 

DUAL-PROCESS  THEORY 

Dual-process  theory  (Wisniewski,  1997a, 
1997b)  proposes  that  two  main  mechanisms 
underlie  conceptual  combination:  structural 
alignment  and  scenario  formation.  Each  of  these 


¹  There  are  circumstances  under  which  analogy  is  cer¬
tainly  used  to  interpret  combinations;  for  instance,  it  is  hard 
to  explain  how  Irangate  could  be  interpreted  without  using 
Watergate  by  analogy  (see  Shoben,  1989).  However,  this 
type  of  interpretation  is  uncommon  and  cannot  account  for 
most  of  the  interpretations  normally  produced. 
processes  is  responsible  for  explaining  the  dif¬
ferent  types  of  interpretation  that  people  pro¬ 
duce.  Structural  alignment  is  proposed  to  ex¬ 
plain  property  interpretations  where  a  proper¬
ty  from  one  concept  is  asserted  of  the  other  (e.g.,
an  elephant  fish  is  a  big  fish).  It  also  accounts
for  hybrids  where  the  interpretation  is  some
combination  of  the  properties  of  both  concepts;
e.g.,  a  drill  screwdriver  is  a  two-in-one  tool  with
features  of  both  a  drill  and  a  screwdriver.  Sce¬
nario  formation  is  very  like  Murphy’s  (1988;
Cohen  &  Murphy,  1984)  concept-specialisation
mechanism  and  is  used  to  explain  relational
interpretations  (e.g.,  a  night  flight  is  a  flight  tak¬
en  at  night).  We  will  concentrate  on  the  struc¬ 
tural  alignment  mechanism  here  as  it  is  our  main 
concern. 

The  structural  alignment  process  is  similar 
to  analogical  structure-mapping  (Gentner,
1983;  see  Keane,  1993,  for  a  review).  To  inter¬ 
pret  a  given  compound  phrase  the  structural 
alignment  process  compares  the  two  constitu¬ 
ent  concepts,  and  on  the  basis  of  that  compari¬ 
son  selects  an  alignable  difference  to  transfer 
from  one  concept  to  the  other.  When  two  con¬
cepts  are  compared  a  number  of  different  rela¬ 
tionships  can  be  found  between  their  parts: 
commonalities  (where  both  slot  and  value
match),  alignable  differences  (where  the  slots
match  but  the  values  differ)  and  non-alignable
differences  (where  neither  slot  nor  value  matches;  see  Fig¬
ure  1).  It  is  the  values  that  are  found  in  align¬ 
able  differences  that  are  used  in  property  interpretations;
for  example,  “an  elephant  fish  is  a
big  fish”  is  produced  by  comparing  the  con¬ 
cepts  “elephant”  and  “fish”  noticing  that  the 
“elephant”  and  “fish”  share  the  dimension  size
but  have  different  values  on  that  dimension,  and 
transferring  the  alignable  difference  BIG  from 
“elephant”  to  “fish”.  When  a  single  aligned 
property  is  selected  for  transfer,  a  property  in¬ 
terpretation  is  produced;  if  multiple  properties 
are  transferred,  a  hybrid  interpretation  results. 
The  diagnosticity  of  a  property  may  have  a  role 
in  choosing  between  competing  alignable  dif¬ 
ferences,  if  more  than  one  is  available  (Wis¬ 
niewski,  1997a).  One  important  prediction 
made  from  this  alignment  mechanism  is  that 
property  interpretations  should  increase  in  fre¬ 
quency  when  the  constituent  concepts  of  a  com¬ 
bination  are  similar  (and  hence,  easy  to  align). 
This  prediction  has  been  confirmed  in  several 
studies  (Wisniewski  &  Markman,  1993;  Mark- 
man  &  Wisniewski,  1997).  Dual-process  theory 
is  a  well-developed  account  that  makes  several 
novel  and  interesting  predictions  about  concep¬ 
tual  combinations,  many  of  which  have  been  con¬ 
firmed  empirically.  However,  we  believe  that 
structural  alignment  is  not  used  in  conceptual  com¬ 
bination  and  have,  in  its  stead,  advanced  a  theory
that  can  generate  property,  hybrid  and  relational
interpretations  using  a  very  different  set  of  mech¬ 
anisms  guided  by  certain  high-level  constraints. 
In  the  next  section,  we  briefly  describe  this  theo¬ 
ry  before  describing  some  evidence  that  supports 
it  but  does  not  favour  structural  alignment. 
Figure  1.  The  Different  Relationships  that  Occur  When  Two  Concepts  Are  Aligned
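
The comparison step of structural alignment can be made concrete with a toy sketch. It assumes the usual reading in which a commonality is a shared slot with the same value, an alignable difference is a shared slot with different values, and a non-alignable difference is a slot that only one concept has. The flat slot-value dictionaries, the function names and the feature lists are all invented for illustration; this is not the algorithm of dual-process theory or of any structure-mapping model, which operate on relational structure rather than flat feature lists.

```python
def compare_concepts(a, b):
    """Sort the slots of two concepts into the three kinds of relationship."""
    commonalities, alignable, non_alignable = {}, {}, {}
    for slot in sorted(set(a) | set(b)):
        if slot in a and slot in b:
            if a[slot] == b[slot]:
                commonalities[slot] = a[slot]          # same slot, same value
            else:
                alignable[slot] = (a[slot], b[slot])   # same slot, different values
        else:
            # Slot present in only one of the two concepts.
            non_alignable[slot] = a[slot] if slot in a else b[slot]
    return commonalities, alignable, non_alignable

def property_interpretation(modifier, head, slot):
    """Copy the modifier's value on one alignable slot onto the head concept."""
    result = dict(head)
    result[slot] = modifier[slot]
    return result

# Hypothetical, much simplified concept representations.
elephant = {"size": "big", "animate": True, "trunk": "long"}
fish = {"size": "small", "animate": True, "habitat": "water"}

common, alignable, non_alignable = compare_concepts(elephant, fish)
elephant_fish = property_interpretation(elephant, fish, "size")
print(alignable)
print(elephant_fish)
```

Here only one alignable difference is available, so the choice of what to transfer is trivial; with several candidates, some further criterion (such as the diagnosticity of the property) would be needed to choose among them.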
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THE  CONSTRAINT  THEORY 

Constraint  theory  (Costello,  1996;  Costello 
&  Keane,  1997a,  1997b,  1998)  describes  con¬ 
ceptual  combination  as  a  process  which  con¬ 
structs  representations  that  satisfy  three  con¬ 
straints  of  diagnosticity,  plausibility  and  infor¬ 
mativeness.  These  constraints  derive  from  the 
pragmatics  of  compound  interpretation  and  use 
(Grice,  1975;  see  Costello  &  Keane,  1997b,  for 
details).  In  this  section  we  describe  the  three 
constraints  which  the  theory  proposes;  the  spe¬ 
cific  algorithm  for  building  representations  that 
satisfy  these  constraints  is  described  elsewhere 
(see  Costello,  1996;  Costello  &  Keane,  1997b). 

The  diagnosticity  constraint  requires  the 
construction  of  an  interpretation  containing  di¬ 
agnostic  properties  from  each  of  the  concepts 
being  combined.  The  diagnostic  properties  of 
a  concept  are  those  which  occur  often  in  in¬ 
stances  of  that  concept  and  rarely  in  instances 
of  other  concepts  (similar  to  Rosch’s,  1978,  cue 


validity).  Diagnosticity  predicts  that  the  inter¬ 
pretation  “a  cactus  fish  is  a  prickly  fish”  is  pref¬ 
erable  to  “a  cactus  fish  is  a  green  fish”  because 
PRICKLY is more diagnostic of cactus than
GREEN.  Diagnosticity  also  identifies  the  fo¬ 
cal  concept  or  central  concept  which  an  inter¬ 
pretation  is  about;  the  focal  concept  of  an  in¬ 
terpretation  is  defined  to  be  that  part  of  the  in¬ 
terpretation  which  possesses  the  diagnostic 
properties  of  the  head  noun  of  the  phrase  being 
interpreted. 
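
The idea of diagnosticity can be illustrated with a small sketch. The instance data below are invented for illustration, and the score used (frequency within the concept minus mean frequency in the other concepts) is only one hypothetical way of expressing "often in this concept, rarely in others"; it is not claimed to be the measure used in the published constraint-theory algorithm.

```python
# Hypothetical toy data: each concept maps to a list of instances,
# and each instance is the set of properties it possesses.
INSTANCES = {
    "cactus": [{"prickly", "green"}, {"prickly", "green"}, {"prickly"}],
    "frog": [{"green"}, {"green", "slimy"}],
    "fish": [{"scaly"}, {"scaly", "green"}],
}


def frequency(prop, concept):
    """Proportion of a concept's instances that have the property."""
    instances = INSTANCES[concept]
    return sum(prop in inst for inst in instances) / len(instances)


def diagnosticity(prop, concept):
    """Illustrative score: high when prop is common in concept, rare elsewhere."""
    others = [c for c in INSTANCES if c != concept]
    elsewhere = sum(frequency(prop, c) for c in others) / len(others)
    return frequency(prop, concept) - elsewhere


prickly = diagnosticity("prickly", "cactus")
green = diagnosticity("green", "cactus")
```

With these invented data, PRICKLY scores higher for "cactus" than GREEN does, which matches the preference for "a cactus fish is a prickly fish" described above.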

The  plausibility  constraint  requires  the  con¬ 
struction  of  an  interpretation  containing  seman¬ 
tic  elements  which  are  already  known  to  co¬ 
occur  on  the  basis  of  past  experience.  The  plau¬ 
sibility  constraint  ensures  that  interpretations 
describe  an  object  (or  collection  of  objects) 
which  could  plausibly  exist.  Plausibility  would 
predict  that  the  interpretation  “an  angel  pig  is  a 
pig  with  wings  on  its  torso”  would  be  prefera¬ 
ble  to  “an  angel  pig  is  a  pig  with  wings  on  its 
tail”, because prior experience suggests that
wings are typically attached to the centre of
gravity of an object (see also Downing, 1977).

Figure 2. Mean Goodness Ratings for Different Property Interpretations,
from Costello & Keane (1998). [Axes: alignable vs. non-alignable properties;
ratings from Good to Bad.]

The informativeness constraint requires the
construction of an interpretation which conveys
a requisite amount of new information.
Informativeness excludes feasible interpretations
that do not communicate anything new relative
to either constituent concept; for example, “a
pencil bed is a bed made of wood” is a feasible
interpretation for “pencil bed” but no one
presented with this compound has ever produced
it as an interpretation (see Costello & Keane,
1997a). Together these three constraints
account for the range of different combination
types that have been observed: each combination
type represents a different way of satisfying
the constraints.

Empirical support for constraint theory
comes from analyses of the rates of different
interpretation types produced for combinations
involving constituents of different classes (e.g.,
artifacts, natural kinds, superordinates and
basic-level concepts; see Costello & Keane, 1997a,
1997b). However, the theory also makes a novel
prediction about the frequency of property
interpretations in so-called reversed-focal
interpretations. In reversed-focal interpretations,
the focal concept is the modifier concept (i.e.,
the first word) rather than the head (i.e., the
second word); for instance, “a chair ladder is a chair
that is by necessity used as a ladder” (see also
Gerrig & Murphy, 1992; Wisniewski & Gentner,
1991). In constraint theory, the referent of
an interpretation is identified by the diagnostic
properties of the head concept of the phrase
interpreted. Therefore, the theory predicts that
reversed-focal interpretations should involve the
diagnostic properties of the head being mapped
to the modifier; in short, that reversed-focal
interpretations will be property interpretations.
Costello & Keane (1997a) found that this
is indeed the case: while property interpretations
were in general less frequent (around
30%) than relational interpretations (around
50%), for reversed-focal interpretations this
pattern was reversed, with around 50% of
reversed-focal interpretations being property
mappings and around 30% being relational
interpretations.


The constraint theory has also been implemented
in a running computational model that
has been tested on a large number of combinations:
the C³ model (Constraints on Conceptual
Combination).

EVIDENCE  AGAINST  ALIGNMENT 

Both  the  constraint  theory  and  alignment 
theory  speak  to  a  common  corpus  of  empirical 
evidence, and each makes its own predictions
about  certain  novel  phenomena.  The  difficulty 
is  in  finding  evidence  that  decides  between  the 
two  theories. 

We  know  of  only  one  piece  of  evidence  that 
appears  to  present  some  difficulties  for  the  pre¬ 
dictions  of  the  dual-process  theory;  namely, 
Costello  &  Keane’s  (1998)  study  of  people’s 
judgement  of  property  interpretations  involv¬ 
ing  properties  that  were  systematically  varied 
in  terms  of  their  alignability  and  diagnosticity. 
This  experiment  made  use  of  a  goodness  judge¬ 
ment  task  for  a  set  of  property  interpretations 
to  noun-noun  compounds.  In  the  main  experi¬ 
ment,  participants  were  given  four  different 
property  interpretations  of  a  novel  combination, 
each  of  which  reflected  one  of  the  logical  pos¬ 
sibilities  involving  the  two  variables.  For  the 
novel combination “bumblebee moth”, for example,
participants received the following four
possible  interpretations: 

Bumblebee  moths  are 

(a)  moths  that  are  black  and  yellow 
(aligned  diagnostic) 

(b)  moths  that  are  the  size  of  a  bumblebee 
(aligned  non-diagnostic) 

(c)  moths  that  sting 
(non-aligned  diagnostic) 

(d)  moths  that  fertilise  plants 
(non-aligned  non-diagnostic) 

Participants  then  rated  the  goodness  of 
these  meanings  for  the  combination,  using  a 
seven-point  scale  (from  -3  to  +3).  The  inter¬ 
pretations  used  in  this  main  experiment  were 
constructed based on analyses from two pre-test
experiments. In Pre-test 1, alignable and
non-alignable differences for the concepts in
each noun-noun phrase were gathered (using
Markman & Wisniewski’s, 1997, methodology).
In Pre-test 2, the diagnosticity of these
selected alignable and non-alignable properties
was determined in a rating study.

The  results  showed  that  people  prefer  prop¬ 
erty  interpretations  using  non-alignable  prop¬ 
erties  (if  they  are  diagnostic)  to  alignable  dif¬ 
ferences  (if  they  are  not  diagnostic;  see  Figure 
2).  Notably,  this  experiment  clearly  shows  that 
diagnostic,  non-alignable  properties  support 
good property interpretations. The alignment
account, by contrast, predicts that alignable
properties will always be preferred.

CONCLUSIONS 

In this paper, we have argued that structure
mapping does not extend to an account of
conceptual combination and that other mechanisms
provide a better account. It could be
argued that certain parts of constraint theory
might be handled by an alignment mechanism
(e.g., the plausibility constraint is a likely
candidate). We would resist such a proposal, if only
to clarify the different sides in the debate. But
there are broader reasons for preferring a pure
constraint account. It seems to us that
the pragmatics of understanding conceptual
combinations are quite different from those which
hold in analogy, and that, as such, there should
be no reasonable expectation for analogical
processes to play a role.
ABSTRACT

Early  research  on  the  ability  of  chimpan¬ 
zees  to  complete  analogies  provided  evidence 
of  that  ability  with  regard  to  relationships  be¬ 
tween  physical  properties  of  geometric  forms 
and  with  regard  to  functional  relationships  be¬ 
tween  common  objects.  Recent  research,  requir¬ 
ing  not  only  completion  of  partially-construct¬ 
ed  analogies,  but  also  construction  of  an  analo¬ 
gy  from  its  elements,  provided  evidence  for 
both  abilities  in  the  chimpanzee.  However,  the 
data  suggest  that  the  strategies  used  by  the  chim¬ 
panzee  to  solve  such  problems  may  be  analo¬ 
gous,  but  not  identical,  to  those  used  by  humans. 

Classical analogy problems involve perceptions
and judgments about relations between
relations. Typically, the ability to solve
such problems is regarded as a measure of
computationally complex reasoning at a
developmentally sophisticated level (e.g., Goswami,
1991; Holyoak & Thagard, 1997; Piaget,
1977; Sternberg, 1977, 1982; Sternberg &
Nigro, 1980; Vosniadou & Ortony, 1989). The
question of whether such sophisticated reasoning
is unique to humans has been a perennial
topic for debate (cf. Darwin, 1871; Griffin,
1992; James, 1981/1890; Vauclair, 1996;
Weiskrantz, 1985). There are techniques
which allow one to systematically examine
analogical reasoning and its component processes
in species other than humans even though
they lack the capacity for verbal report. For
example, Gillan, Premack and Woodruff
(1981) reported that a chimpanzee, Sarah,
solved analogies which were instantiated using
simple geometric forms presented in a 2 x
2 matrix format as shown in Figure 1. Here
the stimuli A and A’ exemplified a certain
relation (i.e., large vs. small), the stimuli B and
B’ exemplified the same relation but with
different items (i.e., squares rather than circles),
and “same” was the plastic token for this
concept from the chimpanzee’s artificial language
(Premack, 1976). Thus, the array shown in
Figure 2 represents an analogy that a human might
verbalize as, “large circle is to small circle as
large square is to small square.”

In one set of experiments, Gillan et al.
(1981) presented the chimpanzee Sarah with
four items presented in the 2 x 2 format. If the
arrangement constituted a true analogy then
Sarah’s task was to place her token for “same”
in the center of the analogy matrix between the
two arguments of that analogy. If the arrangement
of items did not constitute an analogy then
Sarah was correct if she placed her token for
“different” in the center of the matrix. In
another set of analogy problems Gillan et al. (1981)
presented Sarah with three terms of an analogy
(i.e., A, A’ and B) which were positioned
according to the format described above. Sarah’s
task was to select the appropriate fourth term
(B’) that was presented with another, but
inappropriate, alternative.

Figure 1. The 2 x 2 matrix format used by Gillan et al. (1981).

In addition to solving these analogy problems
involving arbitrary relations between geometric
forms, Sarah also solved analogy problems
(Gillan et al., 1981; Exp. 3) in which
common objects were used as the elements from
which  analogies  could  be  constructed  based  on 
functional  relations  (e.g.,  padlock  is  to  key  as  tin 
can  is  to  can  opener).  In  these  functional  analo¬ 
gy  problems,  the  objects  used  to  construct  them 
were  presented  in  the  same  matrix  format  as  were 
the  geometric  problems.  In  both  geometric  and 
functional  analogies,  Sarah’s  task  was  essentially 
the  same:  To  complete  (or  evaluate)  a  2  X  2  ar¬ 
rangement  of  objects  in  which  the  relationship 
between  the  items  in  the  left  column  was  equiv¬ 
alent  to  the  relationship  between  the  items  in  the 
right  column. 

Gillan et al. (1981) interpreted Sarah’s
successful performance on both geometric and
functional analogy problems as reflecting her
ability to reason about relations between
relations. That is, she presumably established the
relationship “same” (or “different”) between
the two sides of the analogy by first assessing
and then comparing the relationships within
each side. However, a close examination of the
choices made by Sarah suggests that at least
some of her apparently analogy-based
performances could have reflected far less
sophisticated strategies.

Consider, for example, those problems
which required Sarah to select a fourth item to
complete a partially-constructed analogy. Sue
Savage-Rumbaugh (personal communication,
1989) challenged the claim that Sarah employed
true analogical reasoning to solve such problems.
Specifically, Savage-Rumbaugh provided
a detailed analysis of Sarah’s performance
which indicated that Sarah need not attend to
the relationship instantiated by the A and A’
elements on the left-hand side of the matrix.
Savage-Rumbaugh showed how Sarah’s choices
could have been determined solely by a
hierarchical set of featural matching rules by
which she identified the choice item most like,
if not identical to, the single item (i.e., B) on the
right-hand side of the matrix. Savage-Rumbaugh’s
analysis was compelling because it predicted
not only the chimpanzee’s correct choices,
but also her errors. Furthermore, studies of
analogical reasoning in 4- and 5-year-old
children (Alexander et al., 1989; Goswami, 1989)
revealed that the less-proficient reasoners
frequently resorted to such strategies.

Figure 2. A geometric analogy in the 2 x 2 matrix format.

Although Savage-Rumbaugh’s featural
similarity matching analysis has some heuristic
value for explaining some of the Gillan et al.
(1981) results, it cannot account for Sarah’s
performance in other experiments in the same
study which were designed explicitly to rule out
physical matching or other associative processes
for problem solving. Nevertheless, Savage-Rumbaugh’s
analysis is important because it
raises fundamental questions regarding the
conditions necessary for the expression of
analogical reasoning abilities (cf. Oden, Thompson
& Premack, 1990). For example, Sarah’s
analogical reasoning ability may only have been
expressed in situations where it was mandated
by the structure of the task. Consider, for
example, the case of functional analogies. Faced
with the question, “Padlock is to key as tin can
is to...?” Sarah could not have chosen a can
opener instead of a paintbrush other than by
comparing functional relationships. The utility
of associative strategies in this task was
precluded by the experimental design.

Recent  advances  in  the  study  of  analogies 
by  a  chimpanzee. 

We present here a summary of extensive data
analyses of more recent research conducted with
Sarah on analogical problem solving tasks (Oden,
Thompson & Premack, in preparation a; Oden,
Thompson & Premack, in preparation b). These
experiments were conducted in part to determine
the boundary conditions for Sarah’s analogical
reasoning. For example, would Sarah use
analogical reasoning spontaneously in situations
where a simpler associative strategy would
suffice? If so, then one could argue that she is
predisposed, as are we humans, to reason about
relations between relations, seeking out metaphor
even when it is not explicitly required. Another
goal of this research was to determine
whether Sarah could also construct, rather than
merely complete, analogies. This task is
substantially more demanding than those she faced in
her earlier work. Completing or evaluating
analogies requires one to compare relations which
have been previously established; constructing
analogies, however, requires one to seek out
relations which reside among stimuli, but which
have yet to be specified.

The materials used in this series of analogy
tasks were similar to those used in the Gillan
et al. (1981) geometric analogy problems. Sarah
worked with an analogy board: a blue cardboard
rectangle with an attached white cardboard
cross, the arms of which extended across
the length and width of the rectangle. This
provided, at each corner of the rectangle, a recess
into which stimuli could be placed to construct
an analogy. Sarah’s plastic token for the
concept “same” was placed at the intersection of
the display board’s arms.


The  experimental  stimuli  were  squares  of 
white  cardboard,  each  with  a  geometric  form 
stenciled  on  it.  The  forms  varied  in  color  (4), 
shape  (3),  size  (2),  and  whether  they  were  filled 
in with color or simply a colored outline. All
possible  combinations  of  these  properties  were 
used  to  create  a  pool  of  48  different  items  which 
were  used  in  the  experiments  reported  here. 
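
As a quick check on the size of the item pool, the four dimensions can be crossed exhaustively. In this sketch only the number of values per dimension comes from the text; the value names themselves are placeholders.

```python
from itertools import product

# Number of values per dimension follows the text (4 x 3 x 2 x 2);
# the value names are placeholders, not the actual stimuli.
DIMENSIONS = {
    "color": ["color1", "color2", "color3", "color4"],
    "shape": ["shape1", "shape2", "shape3"],
    "size": ["large", "small"],
    "fill": ["filled", "outline"],
}

# One dict per item, e.g. {"color": "color1", "shape": "shape1", ...}
ITEM_POOL = [
    dict(zip(DIMENSIONS, values)) for values in product(*DIMENSIONS.values())
]

pool_size = len(ITEM_POOL)
```

The full crossing yields 4 x 3 x 2 x 2 = 48 distinct items, the pool size reported above.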

The  following  rules  were  used  to  select 
items  for  the  analogies.  A  and  A’  differed  with 
respect  to  a  single  dimension  (size,  color,  shape 
or  fill).  B  and  B’  also  differed  in  this  single 
dimension.  A  differed  from  B  (and  thus  A’  dif¬ 
fered  from  B’)  on  two  dimensions,  each  differ¬ 
ent  from  the  property  distinguishing  A  and  A’. 
For example, if A’ represented a size transformation
of A, then B might differ from A with
respect  to  color  and  shape  or  shape  and  fill. 
Following  these  rules,  a  total  of  612  unique 
combinations  of  4  stimuli  could  be  selected 
which,  when  appropriately  placed  on  the  board, 
would  create  an  analogy.  When  experimental 
conditions  required  presentation  of  an  addition¬ 
al  (error)  alternative  choice  item,  this  item  (C) 
differed  from  B'  along  the  dimension  which 
was  not  used  in  constructing  the  analogy.  For 
example, if the analogy was a “size x
shape+fill”, then C differed from B’ in color.
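
The selection rules can be restated as a small checker. This is a sketch of the rules as described in the text, not the procedure the investigators actually used; the function names and the example items are hypothetical.

```python
DIMS = ("color", "shape", "size", "fill")


def differing(x, y):
    """Names of the dimensions on which two items differ."""
    return {d for d, a, b in zip(DIMS, x, y) if a != b}


def is_analogy_set(a, a2, b, b2):
    """True if (A, A', B, B') obeys the selection rules in the text."""
    within = differing(a, a2)   # dimension distinguishing A from A'
    between = differing(a, b)   # dimensions distinguishing A from B
    return (
        len(within) == 1
        and differing(b, b2) == within
        and len(between) == 2
        and differing(a2, b2) == between
        and not (within & between)
    )


def is_error_item(c, a, a2, b, b2):
    """True if C differs from B' only on the dimension the analogy left unused."""
    unused = set(DIMS) - differing(a, a2) - differing(a, b)
    return differing(c, b2) == unused


# Hypothetical "size x color+shape" analogy; items are (color, shape, size, fill).
A = ("red", "circle", "large", "filled")
A2 = ("red", "circle", "small", "filled")
B = ("blue", "square", "large", "filled")
B2 = ("blue", "square", "small", "filled")
C = ("blue", "square", "small", "outline")

ok_analogy = is_analogy_set(A, A2, B, B2)
ok_error_item = is_error_item(C, A, A2, B, B2)
```

In this example the unused dimension is fill, so the error alternative C differs from B’ only in fill.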

Sarah  worked  with  these  materials  under  four 
conditions.  In  two  conditions,  she  was  required 
to  complete  partially-constructed  analogies  which 
were  presented  on  the  analogy  board.  In  two  oth¬ 
er  conditions,  she  was  presented  with  an  empty 
analogy  board  along  with  the  appropriate  stimu¬ 
lus  items  and  had  to  construct  an  analogy  from 
scratch.  Throughout  the  study,  a  unique  set  of  4 
analogy  items  was  used  on  each  trial. 

General  test  procedures.  A  standard  test 
procedure  was  used  in  all  conditions.  On  each 
trial  of  a  test  session,  the  trainer  placed  the  anal¬ 
ogy  board  just  inside  the  wire  mesh  of  Sarah’s 
home cage enclosure. The board contained either
a partially-constructed analogy (Completion
Conditions 1 & 2) or no stimuli at all
(Construction Conditions 3 & 4). The stimuli which
served as ‘answer’ alternatives were contained
in a covered cardboard box which the trainer
placed in front of the analogy board. After
presenting the materials, the trainer left the room
and  recorded  Sarah’s  behavior  via  a  one-way 
mirror.  Sarah’s  task  was  to  open  the  alterna¬ 
tives  box,  make  her  selections  and  place  the 
items  in  the  empty  recesses  of  the  analogy 
board.  Any  unused  items  were  either  left  in  the 
box  or,  at  Sarah’s  discretion,  placed  in  a  pie  tin 
adjacent  to  the  testing  area.  She  then  rang  a 
small  bell  inside  her  enclosure,  summoning  the 
trainer  back  into  the  room. 

In  those  sessions  where  the  design  called 
for differential feedback (Completion Condition 1),
Sarah was praised and given a piece of
fruit  after  each  trial  when  she  had  completed 
an  analogy.  When  she  erred,  she  was  mildly 
admonished  and  the  trainer  demonstrated  the 
proper  arrangement  of  stimuli  but  gave  no  food 
reward.  In  those  sessions  which  called  for  non¬ 
differential  feedback  (Completion  Condition  2; 
Construction  Conditions  3  &  4),  Sarah  was 
praised  and  given  a  food  reward  for  every  trial 
regardless  of  her  accuracy,  unless  she  had  left 
an  unfilled  space  on  her  analogy  board.  In  that 
case,  the  trainer  pointed  to  the  empty  recess  and 
instructed  Sarah  to  “Do  better  next  time.”  No 
other  feedback  was  given  on  such  trials.  Under 
non-differential  feedback,  no  particular  prob¬ 
lem-solving  strategy  is  explicitly  required,  al¬ 
lowing  the  chimpanzee,  if  she  is  so  inclined,  to 
demonstrate  spontaneous  analogical  reasoning 
(cf.,  Oden,  Thompson  &  Premack,  1988). 

DOES  A  CHIMPANZEE  COMPLETE 
ANALOGY  PROBLEMS 
ANALOGICALLY? 

Condition  1:  Completion  with  two  alter¬ 
natives.  This  condition  was  a  replication  of  the 
forced-choice  task  used  by  Gillan  et  al.  (1981), 
in  which  Sarah  was  required  to  select  a  single 
item  (B’)  to  complete  a  partially-constructed 
analogy.  This  condition  was  intended  to  famil¬ 
iarize  Sarah  with  the  new  analogy  board  and 
stimulus  items,  and  to  provide  a  performance 
baseline.  The  analogy  elements  A,  A’  and  B 
were  placed  in  their  appropriate  positions  on 
the  board  by  the  trainer.  Two  items,  B’  and  an 
error  alternative  (C),  were  placed  in  the  alter¬ 


natives  box.  One  session  of  twelve  trials  was 
run  using  differential  feedback. 

Three of the 12 trials could not be scored
because  one  or  more  of  the  recesses  on  the  anal¬ 
ogy  board  were  empty  when  the  trainer  was 
summoned  by  Sarah’s  bell.  In  two  of  these  cas¬ 
es,  this  was  the  result  of  Sarah  having  disman¬ 
tled  the  partially-constructed  analogy  to  close¬ 
ly  inspect  the  new  stimulus  materials.  In  the 
third  case,  both  alternatives  were  laid  on  the 
floor  beside  the  intact  analogy  board.  Sarah 
succeeded  in  completing  the  analogy  on  8  of 
the  9  trials  which  could  be  scored.  This  level  of 
performance  (89%;  p<  .05,  Binomial  test)  com¬ 
pares  favorably  with  the  75%  overall  accuracy 
reported in the original analogy studies (Gillan
et al., 1981).
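
The binomial test reported for Condition 1 can be reproduced approximately as follows. The sketch assumes a one-tailed test against a chance level of .5 (two alternatives per trial); the text does not state which form of the test was used.

```python
from math import comb


def binomial_tail(successes, trials, p_chance):
    """P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 8 correct completions out of 9 scorable trials, chance assumed to be .5
p_value = binomial_tail(8, 9, 0.5)
```

Under these assumptions the tail probability is 10/512, or roughly .02, which is consistent with the reported p < .05.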

Condition 2: Completion with three alternatives.
This condition was run to determine
whether  Sarah  could  not  only  select  items  nec¬ 
essary  to  complete  an  analogy,  but  also  posi¬ 
tion  them  on  the  board  so  that  the  final  product 
reflected  an  analogical  arrangement.  In  this 
condition,  the  trainer  placed  only  A  and  A’  on 
the  board.  B,  B’  and  C  were  placed  in  the  alter¬ 
natives  box.  Sarah’s  task  was  to  select  and  prop¬ 
erly  arrange  B  and  B’  on  the  board.  The  arrange¬ 
ment  of  the  items  in  the  alternatives  box  was 
random.  Four  sessions  of  twelve  trials  each  were 
run,  using  non-differential  feedback. 

Sarah completed an analogy on 22 of 48
trials (46%), significantly more often than the
16% expected by chance. She selected the
analogy pair (B, B’) on 27 of 48 trials (56%; chance
= 33%). On 22 of these 27 trials (81%; chance
= 50%) the selected items were placed on the
board in the B/B’ arrangement which completed
the analogy begun with A/A’.
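
The chance levels quoted for Condition 2 follow from enumerating the ways two of the three alternatives can be placed in the two empty recesses. The sketch below assumes every ordered choice is equally likely under random responding.

```python
from fractions import Fraction
from itertools import permutations

ALTERNATIVES = ["B", "B'", "C"]

# An outcome is (item placed in B's recess, item placed in B''s recess).
outcomes = list(permutations(ALTERNATIVES, 2))

pair_selected = [o for o in outcomes if set(o) == {"B", "B'"}]
analogy_completed = [o for o in outcomes if o == ("B", "B'")]

p_pair = Fraction(len(pair_selected), len(outcomes))
p_placement = Fraction(len(analogy_completed), len(pair_selected))
p_analogy = Fraction(len(analogy_completed), len(outcomes))
```

This gives 1/3 for selecting the analogy pair, 1/2 for arranging it correctly once selected, and 1/6 (about 16 to 17%) for completing the analogy by chance.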

Sarah’s  overall  success  at  completing  anal¬ 
ogies  under  this  second  condition,  while  statis¬ 
tically  significant,  was  substantially  lower  than 
in  Condition  1.  Our  examination  of  her  rela¬ 
tive  success  on  the  two  components  of  this  task 
(item  selection  and  analogical  placement)  sug¬ 
gests  that,  for  Sarah,  the  first  component  was 
the  more  difficult  of  the  two.  That  is,  although 
she  selected  the  potential  analogy  choice  pair 
on  only  56%  of  the  trials,  once  this  pair  was 
selected, Sarah arranged them analogically 81%
of the time. Contrary to Savage-Rumbaugh’s
analysis of Gillan et al.’s (1981) initial results,
the present data strongly suggest that Sarah’s
performance on analogy completion tasks was
not significantly influenced by a simple matching
strategy or other assessments of mere
featural similarity. Rather, Sarah’s performance
was  guided  by  the  relations  between  features 
in  the  A/A’  arrangement  presented  on  her  anal¬ 
ogy  board. 

Her  attention  to  relations  is  particularly 
striking  given  that  non-differential  reinforce¬ 
ment  was  used  in  Condition  2.  This  meant  that 
she  could  have  used  any  strategy  whatsoever 
(including  random  selection  and  placement)  to 
fill  the  analogy  board.  Nevertheless,  she  appears 
to  have  spontaneously  adopted  the  relations 
between  relations  strategy.  The  next  two  con¬ 
ditions  were  intended  to  determine  whether 
Sarah  could  detect  and  use  relations  to  con¬ 
struct  an  analogy  when  presented  with  the  nec¬ 
essary  elements  and  an  empty  analogy  board. 

Will  a  chimpanzee  construct  analogies 
spontaneously? 

Condition  3:  Construction  with  four  al¬ 
ternatives.  In  this  condition,  Sarah  was  pre¬ 
sented  with  a  completely  empty  analogy 
board  and  her  alternatives  box  containing  the 
four  items  necessary  to  construct  an  analo¬ 
gy.  When  Sarah  placed  the  items  in  the  re¬ 
cesses  of  her  analogy  board,  non-differential 
reinforcement  was  given,  regardless  of 
whether  their  arrangement  constituted  a  val¬ 
id  analogy.  The  criterion  used  for  scoring  her 
constructions  was  as  follows.  Sarah  did  not 
have  to  place  the  stimulus  items  originally 
designated  by  the  investigators  as  A,  A’,  B, 
B’ in any particular recess. Any arrangement
using these four elements was accepted as an
analogy if A and B appeared together on one
axis (row or column) of the board, and
A and A’ appeared together on the alternative
axis (column or row). This scoring rule
was based on the property of an analogy that
its elements and arguments may be interchanged
in certain ways and still maintain
analogical relations. For example, the
construction “dog:cat::puppy:kitten” is as valid as
“cat:kitten::dog:puppy” even though the
relations expressed are rearranged. However,
“cat:puppy::kitten:dog” would not be accepted
as a valid analogy.

There  were  24  possible  arrangements  of  the 
items  for  a  given  trial,  8  of  which  (33%)  would 
qualify  as  analogies.  Sarah  constructed  valid 
analogies  on  28  of  45  trials  (62%),  significant¬ 
ly  more  often  than  expected  by  chance.  These 
results  provide  good  evidence  that  Sarah  con¬ 
structed  classical  analogies  using  the  same  cri¬ 
teria  as  a  human. 
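
The figure of 8 valid arrangements out of 24 can be checked by enumerating every placement of the four items and applying the scoring rule stated above. The board coordinates and function names in this sketch are illustrative only.

```python
from itertools import permutations

RECESSES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (row, column) of each recess
ITEMS = ["A", "A'", "B", "B'"]


def same_row(p, q):
    return p[0] == q[0]


def same_col(p, q):
    return p[1] == q[1]


def is_valid_analogy(placement):
    """Scoring rule: A and B share one axis, A and A' share the other."""
    a, a2, b = placement["A"], placement["A'"], placement["B"]
    return (same_row(a, b) and same_col(a, a2)) or (
        same_col(a, b) and same_row(a, a2)
    )


arrangements = [dict(zip(ITEMS, perm)) for perm in permutations(RECESSES)]
valid = [p for p in arrangements if is_valid_analogy(p)]
chance_level = len(valid) / len(arrangements)
```

Every valid arrangement places A diagonally opposite B’, which is why exactly one third of the 24 placements qualify.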

Did  Sarah  additionally  understand  the  na¬ 
ture  of  the  task  before  her?  That  is,  did  she 
intend  to  construct  an  analogy  when  she  be¬ 
gan  a  trial  or  did  analogies  unintentionally 
unfold as a necessary consequence of her initial
choices? The answer to this question lies
in  the  nature  of  her  first  two  choices  and  their 
placement  on  the  board.  On  approximately 
90%  of  the  trials,  Sarah  placed  her  first  two 
choices  in  the  same  row  or  column  on  her 
analogy  board,  thereby  determining  whether 
an  analogy  could  be  completed. 

With  4  alternatives,  there  were  12  possi¬ 
ble  ways  that  the  first  2  items  could  be  cho¬ 
sen.  Eight  of  these  combinations,  when 
placed in the same row or column of the analogy
board, constituted a “potential analogy”
(i.e.,  they  could  become  part  of  a  valid  anal¬ 
ogy  if  the  remaining  items  were  arranged 
properly).  Thus,  Sarah  could  create,  random¬ 
ly,  a  potential  analogy  67%  of  the  time.  But, 
in  fact,  her  first  two  choices  and  placements 
produced  potential  analogies  82%  (37/45  tri¬ 
als)  of  the  time.  Thus,  we  have  evidence  that 
Sarah  exercised  what  might  be  called  “fore¬ 
sight”  in  constructing  her  analogies.  She  es¬ 
sentially  created  the  initial  conditions  that 
had  been  previously  provided  by  the  experi¬ 
menters  in  Condition  2  of  the  completion 
task.  On  76%  (28/37)  of  these  trials  Sarah 
successfully  completed  the  construction  of  a 
valid  analogy.  This  level  of  success  is  con¬ 
sistent  with  her  prior  performances  on  the 
completion tasks reported here and by Gillan
et al. (1981).
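
The chance level for a "potential analogy" can be derived in the same way. The sketch relies on a consequence of the Condition 3 scoring rule: in every valid arrangement A sits diagonally opposite B’ and A’ opposite B, so a first pair placed in one row or column can belong to a valid analogy only if it is not one of those two diagonal pairs.

```python
from itertools import permutations

ITEMS = ["A", "A'", "B", "B'"]

# Pairs that must occupy opposite corners in any valid analogy.
DIAGONAL_PAIRS = [{"A", "B'"}, {"A'", "B"}]

# Ordered (first choice, second choice) selections from the four items.
first_two = list(permutations(ITEMS, 2))
potential = [pair for pair in first_two if set(pair) not in DIAGONAL_PAIRS]

chance_level = len(potential) / len(first_two)
```

This reproduces the 8 of 12 (67%) chance level against which Sarah’s 82% is compared.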
Condition 4: Construction with five alternatives.
This condition was used to explore
the  effect  of  requiring  an  additional  selection 
process as part of analogy construction. Recall
that in Condition 2, the selection process
proved  to  be  more  fragile  than  the  arrangement 
process.  We  were  curious  whether  Sarah,  faced 
with  this  additional  complexity,  would  resort 
to  a  simpler  associative  strategy  or  perhaps 
abandon  all  strategies  in  favor  of  random  se¬ 
lection  and  placement.  In  this  condition,  Sa¬ 
rah  was  presented  with  an  empty  analogy 
board  and  her  box  of  alternatives  which  con¬ 
tained  four  elements  that  could  be  used  to  con¬ 
struct  an  analogy,  and  a  fifth,  unusable  item 
(C,  the  error  alternative).  As  in  Condition  3, 
Sarah’s  task  was  simply  to  fill  the  four  empty 
spaces  on  the  board  for  which  she  received 
non-differential  feedback. 

In  this  condition,  Sarah  constructed  analo¬ 
gies  on  21%  of  the  trials.  As  expected,  this  lev¬ 
el  of  performance  was  substantially  lower  than 
performance  in  the  three  preceding  conditions, 
but it was nevertheless still significant (p < .001,
binomial  test).  As  before,  we  examined  the  se¬ 
quence  of  Sarah’s  selections  and  placements 
to  determine  whether  her  performance  truly 
reflected  analogical  reasoning  or  if  it  was  the 
accidental  byproduct  of  some  simpler  strategy. 
Two  such  strategies  are  considered  below. 

Strategy 1: Minimizing featural differences.
One possible strategy is that Sarah was
guided  by  an  appreciation  of  a  more  global  pat¬ 
tern  of  relationships  within  an  analogy,  rather 
than  the  relations  between  particular  pairs  of 
items.  We  computed  the  total  number  of  fea¬ 
tural  differences  among  members  of  the  five  4- 
item  sets  which  could  be  drawn  from  the  larger 
5-item  set  presented  on  each  trial.  According 
to  the  rules  used  to  select  those  items,  the  sub¬ 
set  which  could  be  used  to  construct  an  analo¬ 
gy  (A,  A’,  B,  B’)  would  necessarily  involve  a 
minimum number of featural differences. However,
another subset (C, A’, B, B’) also minimized
the number of featural differences between
its members.

If Sarah were following a strategy of “minimize
featural differences on the board,” this
would have led to completion of analogies in
Condition 1. In Condition 2, this strategy would
have  led  to  the  appropriate  selection,  but  not 
necessarily  to  the  appropriate  arrangement,  of 
items  needed  to  construct  an  analogy.  In  the 
present  condition,  this  strategy  would  have  led 
to  the  selection  of  the  potential  analogy  set.  But 
it  should  also  have  led  equally  often  to  the  se¬ 
lection  of  the  set  containing  item  C,  the  error 
alternative.  In  fact,  Sarah  selected  the  potential 
analogy  set  46%  of  the  time  and  selected  the 
other  “minimal-difference”  set  only  12%  of  the 
time  (chance  =  20%).  Thus,  Sarah  was  clearly 
not  trying  to  simply  maximize  overall  similar¬ 
ity  among  the  four  items  placed  on  the  board. 
It  would  be  tempting,  therefore,  to  conclude  that 
the  relationship  between  particular  items  (a  pre¬ 
requisite  of  analogical  reasoning)  was  of  sig¬ 
nificance  to  Sarah.  However,  an  alternative 
strategy  must  be  considered  before  accepting 
this  conclusion. 

Strategy 2: Exclusion of C, “the Odd Man Out”.
It could be that Sarah adopted a strategy
of  excluding  alternative  C  which  possessed  a 
single  property  (size,  shape,  color  or  fill)  which 
was not shared with any of the other items in the five-item set.
This  strategy  would  have  led  Sarah  to  select 
the  four  items  which  could  be  used  to  construct 
an  analogy,  but  only  if  they  were  arranged  ap¬ 
propriately  on  the  board.  Given  a  selection  of 
the  appropriate  items,  one-third  of  their  possi¬ 
ble  arrangements  would  meet  our  criteria,  de¬ 
scribed  previously,  for  an  analogy.  Using  this 
one-third proportion as an estimate of chance
success, Sarah’s performance was not statistically
significant, suggesting, therefore, that Sarah
had not attended to relations between relations
in this condition. However, a more detailed
analysis of the temporal sequence in
which  Sarah  placed  the  four  items  on  the  board 
led  us  to  reject  this  pessimistic  conclusion. 
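
The one-third chance estimate can be checked by enumeration. The sketch below assumes a 2 x 2 board and assumes that an arrangement counts as an analogy when A and B’ occupy diagonally opposite cells, so that every row and every column holds one of the related pairs (A with A’, B with B’, A with B, or A’ with B’). That criterion is an interpretation adopted here because it reproduces the stated proportion; the chapter's own criteria are those described earlier in the text.

```python
from itertools import permutations

ITEMS = ("A", "A'", "B", "B'")
# Board cells in reading order: 0 = upper left, 1 = upper right,
# 2 = lower left, 3 = lower right. Cells 0/3 and 1/2 are diagonal.
DIAGONAL_PARTNER = {0: 3, 3: 0, 1: 2, 2: 1}

def is_analogy(arrangement):
    """Assumed criterion: A sits diagonally opposite B'."""
    cell_of = {item: cell for cell, item in enumerate(arrangement)}
    return DIAGONAL_PARTNER[cell_of["A"]] == cell_of["B'"]

# Every way of placing the four usable items in the four cells.
arrangements = list(permutations(ITEMS))
valid = [a for a in arrangements if is_analogy(a)]
print(len(valid), "of", len(arrangements), "arrangements are analogies")
```

Under this assumption, random placement of the four correct items yields an analogy in one-third of the possible arrangements, which is the baseline against which the text evaluates the "odd man out" strategy.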

SARAH’S  STRATEGY  FOR 
CONSTRUCTING  ANALOGIES. 

Equating  within-pair  differences.  As  Sa¬ 
rah  selected  items  and  placed  them  on  the 
board, she seems to have followed a strategy
of equating the number of within-pair featural
differences, independently of the physical
nature  of  those  differences.  This  strategy  is 
illustrated in Figures 3a-3d. Sarah consistently
placed her first two choices on the same
horizontal or vertical axis of the analogy board,
as illustrated in Figure 3a. Here, B’ (choice 2)
and A (choice 1) have been placed respectively
in the upper and lower recesses (i.e., a vertical
axis) on the left-hand side of the board.
We  can  now  describe  Sarah’s  third  and  fourth 
choices  as  being  placed  adjacent  to  either  her 
first  or  her  second  choices.  In  this  example 
Sarah placed item C (choice 3) in the upper
right-hand recess adjacent to her second choice
(see Figure 3b). Sarah’s fourth choice (A’) was
then placed in the lower right-hand recess adjacent
to her first choice (see Figure 3c). Thus,
Sarah’s  last  two  placements  of  her  third  and 
fourth  choices  could  be  described  as  creating 
two  pairs  as  shown  in  Figure  3d.  The  number 
of  featural  differences  within  each  pair  is  the 
same.  That  is,  there  is  one  featural  difference 
in the B’ & C pair created by Sarah’s placements
of her second and third choices. The A
& A’ pair created by her placements of her
first  and  fourth  choices  similarly  contains  a 
single  featural  difference. 


Figure 3. An illustrative sequence of Sarah’s choices and placements in Condition 4.

Each trial from Condition 4 of the analogy
construction task was analyzed in the manner
described  above  (Oden  et  al.,  in  preparation 
b).  The  expected  frequencies  of  each  combi¬ 
nation  of  featural  differences  were  obtained 
by  determining  the  six  possible  outcomes  giv¬ 
en  her  two  initial  choices.  The  observed  fre¬ 
quencies  of  pairings  which  equated  within- 
pair  differences  significantly  exceeded  their 
expected  frequencies. 

Sarah  apparently  followed  a  strategy  of  nu¬ 
merically  equating  within-pair  featural  differenc¬ 
es  as  she  made  her  last  two  selections  and  placed 
them  on  the  board.  When  Sarah  placed  her  third 
choice  next  to  one  of  the  items  already  on  the 
board  the  resulting  number  of  within-pair  fea¬ 
tural  differences  tended  to  be  subsequently 
matched  within  the  pair  created  by  her  placing 
her  fourth  choice  next  to  the  remaining  item. 

We  argue  that  this  pattern  of  results  re¬ 
veals  analogical  reasoning;  it  involves  rea¬ 
soning  about  relations  between  relations. 
There  is  a  difference,  of  course,  between 
the  strategy  employed  by  Sarah  and  the  a 
priori  rules  we  used  to  construct  analogies. 
Whereas  we  had  attended  to  the  nature  of 
specific  features,  as  well  as  their  number, 
Sarah  attended  only  to  the  number  of  fea¬ 
tural  differences.  For  example,  we  regarded 
a  (color+shape)  transformation  as  differing 
from  a  (size+fill)  transformation.  In  Sarah’s 
eyes  these  transformations  were  equivalent 
because  they  both  entailed  two  featural  dif¬ 
ferences.  Thus,  compared  to  our  reasoning, 
Sarah’s  may  lack  rigor,  but  fundamentally, 
she  still  reasoned  about  relations  between 
relations.  We  do  not  believe  that  Sarah’s 
failure  to  attend  to  featural  details  beyond 
number  reflects  a  fundamental  constraint  on 
her  reasoning  abilities.  Recall  that  the  re¬ 
sults  from  Condition  2  of  the  completion 
task  indicated  that  selection  of  items  was  a 
more  difficult  task  than  their  arrangement. 
We  believe  that  the  decline  in  Sarah’s  per¬ 
formance  in  the  present  condition  of  the 
construction  task  resulted  from  the  inherent 
complexity  of  the  5-item  stimulus  array  with 
which  she  was  presented. 
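
The contrast between the two criteria can be made concrete. In the sketch below each item is a record of four features (size, shape, color, fill); the particular feature values are invented for illustration and are not the stimuli used in the study. One function implements the counting rule attributed to Sarah; the other is a simplified version of the stricter rule, which also requires that the same features change in both pairs.

```python
FEATURES = ("size", "shape", "color", "fill")

def changed_features(item_x, item_y):
    """Set of feature names on which two items differ."""
    return {f for f in FEATURES if item_x[f] != item_y[f]}

def equal_number_of_differences(pair_1, pair_2):
    """Counting rule: the two pairs differ on the same NUMBER of features."""
    return len(changed_features(*pair_1)) == len(changed_features(*pair_2))

def same_transformation(pair_1, pair_2):
    """Stricter rule: the two pairs differ on the SAME features."""
    return changed_features(*pair_1) == changed_features(*pair_2)

# Hypothetical items (placeholder values, not the original stimuli).
a = {"size": "large", "shape": "circle", "color": "red", "fill": "solid"}
# a -> a_prime changes color and shape.
a_prime = {"size": "large", "shape": "square", "color": "blue", "fill": "solid"}
b = {"size": "large", "shape": "star", "color": "green", "fill": "solid"}
# b -> b_prime changes size and fill.
b_prime = {"size": "small", "shape": "star", "color": "green", "fill": "open"}

print(equal_number_of_differences((a, a_prime), (b, b_prime)))
print(same_transformation((a, a_prime), (b, b_prime)))
```

With these placeholder items, the (color+shape) and (size+fill) transformations are equivalent under the counting rule but not under the stricter rule, which mirrors the difference described in the text.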


SUMMARY 

Collectively,  the  results  from  these  four 
conditions not only confirm that an adult chimpanzee
can solve analogies (Gillan et al., 1981),
but  also  demonstrate  that  she  does  so  sponta¬ 
neously,  even  in  situations  where  a  simpler  as¬ 
sociative  strategy  would  suffice. 

In Condition 1 we replicated Gillan et
al.’s (1981) earlier findings, which demonstrated
that when faced with a partially constructed
analogy problem Sarah, the same subject,
successfully  selected  from  two  available 
choices  that  item  which  would  complete  the 
analogy.  In  condition  2  of  the  completion 
task,  Sarah  demonstrated  conclusively  that 
her performance was mediated by analogical
relationships and not a simple associative
similarity  matching  strategy.  When  present¬ 
ed  with  only  two  elements  of  a  classical  anal¬ 
ogy  problem  she  successfully  chose  from  3 
alternatives  the  two  elements  necessary  to 
complete the problem. More important,
however, was the finding that her spatial arrangement
of these choices was guided by the
relation  initially  established  by  the  experi¬ 
menters  and  not  on  the  basis  of  mere  similar¬ 
ity  along  any  single  physical  dimension. 

In Conditions 3 and 4 we further demonstrated
that the same chimpanzee, Sarah, could
not only complete but also construct
analogies. When presented with a randomized
grouping of elements from which an
analogy could be constructed, she proceeded
to do so spontaneously. When presented with the
minimum  of  4  elements  she  proceeded  to  ar¬ 
range  all  of  them  in  analogical  fashion.  When 
presented  with  5  elements  of  which  4  could 
be  used  to  construct  an  analogy  she  ignored 
the  inappropriate  item  and  successfully  ar¬ 
ranged  the  remaining  items  analogically. 
However,  she  did  so  in  a  manner  analogous 
to,  but  not  identical  with  that  of  her  human 
experimenters.  On  the  one  hand,  we  had  at¬ 
tended  to  both  specific  physical  factors  and 
their number in each within-pair transformation.
Sarah, on the other hand, attended to
only the latter numerical dimension.

PRECURSORS FOR ANALOGICAL
REASONING 

Some  investigators  have  argued  that  ana¬ 
logical  reasoning  is  the  common  foundation 
(denominator)  of  much  of  human  reasoning 
including  logical  inference  (e.g.,  Halford,  1992). 
Our  results  confirm  earlier  reports  (Gillan  et  al., 
1981)  that  it  is  well  within  the  capabilities  of  at 
least  one  adult  chimpanzee.  Might  this  capaci¬ 
ty  be  expected  in  chimpanzees  other  than  Sa¬ 
rah?  Our  answer  is  a  qualified  yes.  Prior  to  her 
experience  with  formal  analogical  problem 
solving  Sarah  had  mastered  a  conceptual  match¬ 
ing  task  (Premack,  1978)  which,  at  the  age  of 
39  years,  she  still  successfully  performed  un¬ 
der  conditions  of  nondifferential  reinforcement 
(Thompson,  Oden  &  Boysen,  1998). 

In  the  conceptual  matching  task  a  subject  is 
required  to  match  a  pair  of  physically  identical 
sample  items  (e.g.,  a  pair  of  locks)  with  another 
pair  of  identical  items  (e.g.,  a  pair  of  cups)  as 
opposed to a pair of physically nonidentical
items  like,  for  example,  a  pencil  and  an  eraser. 
Conversely,  this  latter  nonidentical  pair  would 
be  the  correct  match  given  another  nonidentical 
sample  pair  such  as  a  shoe  and  ball.  Successful 
performance  of  the  conceptual  matching  task 
described  above  involves  the  matching  of  rela¬ 
tions  between  relations.  It  is  then  in  essence  a 
form  of  analogy  in  which  all  the  arguments  are 
provided  for  the  subject.  We  believe,  therefore, 
that  any  chimpanzee  capable  of  performing  the 
conceptual  matching  task  possesses  the  compu¬ 
tational  cognitive  foundations  upon  which  for¬ 
mal  analogical  reasoning  rests. 
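
The structure of the conceptual matching task can be sketched in a few lines. The function names are hypothetical, and reducing each pair to a "same" or "different" label is a simplifying assumption about what the task requires; the example items are those mentioned in the text.

```python
def relation(pair):
    """Abstract relation holding within a pair: 'same' or 'different'."""
    first, second = pair
    return "same" if first == second else "different"

def conceptual_match(sample, alternatives):
    """Return the alternatives whose internal relation matches the sample's."""
    target = relation(sample)
    return [alt for alt in alternatives if relation(alt) == target]

alternatives = [("cup", "cup"), ("pencil", "eraser")]
print(conceptual_match(("lock", "lock"), alternatives))
print(conceptual_match(("shoe", "ball"), alternatives))
```

Note that no item in the correct alternative physically resembles the sample; the match is made on the relation within each pair, which is why the task involves relations between relations.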

There  is  good  evidence,  however,  that  not 
all chimpanzees can match relations between
relations  despite  their  success  on  physical 
matching  tasks.  Some  prior  experience  with 
tokens  which  symbolize  abstract  same/differ¬ 
ent  relations  is  apparently  a  necessary  prereq¬ 
uisite  for  the  explicit  expression  by  a  chimpan¬ 
zee  of  their  otherwise  only  implicit  knowledge 
about relations between relations (Premack,
1983; Thompson et al., 1998). Presumably, experience
with external symbol systems in some
way provides the necessary representational
scaffolding for the complex computational operations
involved in solving problems involving
conceptually abstract similarity judgments
as in analogies (Clark & Thornton, 1997; Gentner
& Markman, 1997; Sternberg & Nigro,
1980). Interestingly, there is, as yet, no evidence
that old-world macaque monkeys can perceive,
let  alone  judge,  analogical  relations  (Thomp¬ 
son  &  Oden,  1996;  Thompson  &  Oden,  1998; 
Washburn,  Oden  &  Thompson,  1997). 

CONCLUSION 

The  results  described  here  on  analogical 
problem  solving  by  Sarah  demonstrate  that  this 
chimpanzee  is  predisposed,  as  are  adult  humans, 
to  reason  about  relations  between  relations. 
There  was  no  evidence  in  the  completion  and 
construction  analogy  tasks  summarized  above 
that Sarah attempted to use a less efficient associative
strategy, as can occur with young children
(Alexander et al., 1989). If analogical reasoning
is indeed a hallmark of human reasoning,
then its demonstration in a chimpanzee
should  not  be  surprising  to  anyone  comfortable 
with a perspective on the origins of human cognition
in which evolutionary and cultural factors
are conjoined.

Analogical thinking is the basis of much
of  our  everyday  problem  solving.  ‘Analogy 
pervades  all  our  thinking,  our  everyday 
speech  and  our  trivial  conclusions  as  well  as 
artistic  ways  of  expression  and  the  highest 
scientific  achievements’  (Polya,  1957).  The 
central  role  of  analogy  in  human  cognition 
underlines  the  importance  of  understanding 
the  development  of  reasoning  by  analogy  in 
children.  However,  until  fairly  recently,  there 
was  little  interest  in  analogical  development 
among  researchers  in  child  psychology. 

This  was  because  the  most  famous  de¬ 
velopmental  psychologist,  Piaget,  had  argued 
that  analogical  skills  did  not  develop  until 
early  adolescence,  and  this  conclusion  had 
not  been  challenged.  Rather  than  seeing  anal¬ 
ogy  as  a  fundamental  cognitive  process,  Piag¬ 
et  saw  analogy  as  a  sophisticated  reasoning 
strategy  that  emerged  after  the  primary  years. 
The  main  reason  was  that,  according  to  Piag¬ 
et’s  general  theory  of  logical  development, 
the  ability  to  see  relations  between  relations 
(to  use  ‘higher-order  relations’)  was  a  hall¬ 
mark  of  the  final  stage  of  logical  reasoning, 
called  the  ‘formal  operational’  stage.  Formal 
operational  reasoning  required  children  to 
operate  mentally  on  the  results  of  simpler 
operations. A simpler operation was finding
relations  between  objects  (these  simpler  log¬ 
ical  operations  were  called  ‘concrete  opera¬ 
tions’).  As  analogies  required  children  to  rea¬ 
son  about  relational  similarity  rather  than 
about  relations  between  objects,  it  appeared 
to  be  a  typical  formal  operational  skill. 

Piaget’s  theory  of  logical  development  is 
the  most  widely-taught  theory  in  cognitive  de¬ 
velopmental  psychology  and  in  education.  It 
has also been used as a basis for research in
many related areas (e.g., in theorising about
the  cognitive  processes  in  reading  develop¬ 
ment).  If  Piaget’s  conclusions  about  the  rela¬ 
tive  mental  sophistication  of  analogical  rea¬ 
soning  turn  out  to  be  incorrect,  then  the  impli¬ 
cations  for  educational  practice  are  immense. 

Piaget’s  conclusions  were  based  on  ex¬ 
periments  using  a  pictorial  version  of  the 
standard  test  for  analogical  reasoning  (used 
in  IQ  testing),  the  ‘item  analogy’.  In  item 
analogies,  two  items  A  and  B  are  presented 
to  the  child,  a  third  item  C  is  presented,  and 
the  child  is  required  to  generate  a  D  term  that 
has  the  same  relation  to  C  as  B  has  to  A.  Suc¬ 
cessful  generation  of  a  D  term  requires  the 
use  of  the  relational  similarity  constraint.  For 
example, if the child is given the items ‘cat
is  to  kitten  as  dog  is  to  ?’,  she  is  expected  to 
generate  the  solution  term  ‘puppy’.  The  re¬ 
sponse  ‘bone’,  which  is  a  strong  associate  of 
dog,  would  be  an  error.  Another  example  is 
the analogy ‘Bicycle is to handlebars as ship
is  to  ?’.  Here  the  relation  constraining  the 
choice  of  a  D  term  is  ‘steering  mechanism’, 
and  so  a  child  who  offered  the  completion 
term  ‘bird’  would  not  be  credited  with  un¬ 
derstanding  the  relational  similarity  con¬ 
straint.  Piaget’s  theory  that  analogical  reason¬ 
ing  was  absent  in  children  until  adolescence 
was  based  on  item  analogies  such  as  these. 
Younger  children  tested  by  Piaget  offered 
solutions like ‘bird’ to the bicycle/ship analogy,
giving reasons like ‘both birds and ships
are  found  on  the  lake’.  Piaget’s  interpreta¬ 
tion  of  his  research  was  that  younger  chil¬ 
dren  solved  analogies  on  the  basis  of  associ¬ 
ations.  Children  only  became  able  to  reason 
on  the  basis  of  relational  similarity  at  around 
11-12 years of age.

THE ROLE OF RELATIONAL
FAMILIARITY IN ANALOGICAL
DEVELOPMENT

Closer inspection of Piaget’s experimental
methods suggests a serious flaw, however.
Piaget  had  not  checked  whether  the  younger 
children  in  his  experiments  understood  the 
relations  on  which  his  analogies  were  based 
(relations such as ‘steering mechanism’).
Their  failure  to  solve  the  item  analogies  in 
his  experiments  could  thus  have  arisen  from 
a  lack  of  knowledge  of  the  relations  being 
used.  Item  analogies  based  on  unfamiliar 
relations  would  obviously  underestimate 
analogical  ability. 

The  solution  is  to  design  analogies  based 
on  relations  that  are  known  to  be  highly  fa¬ 
miliar  to  younger  children  from  cognitive  de¬ 
velopmental  research.  Simple  causal  relations 
such  as  melting,  wetting  and  cutting  are  known 
to  be  understood  between  the  ages  of  3  and  4 
years,  and  relations  between  real  world  objects 
such as ‘trains go on tracks’ and ‘birds live in
nests’ are familiar to 4- and 5-year-olds. Item
analogies such as ‘playdoh is to cut playdoh
as apple is to cut apple’ and ‘bird is to nest as
dog is to doghouse’ can thus be used to examine
whether 3- to 5-year-olds have the ability
to  reason  by  analogy. 

For  this  young  age  group,  a  picture-based 
version  of  the  item  analogy  task  was  developed 
(Goswami & Brown, 1989, 1990). The task was
presented as a ‘game’ about matching pictures.
The  children  were  shown  a  ‘game  board’  with 
four  slots  for  pictures,  the  slots  being  grouped 
in  two  pairs  for  the  A:B  and  C:D  parts  of  the 
analogy. As the children watched, the experimenter
presented the first three terms of a given
analogy (e.g., pictures of a bird [A], a nest
[B],  and  a  dog  [C]).  As  the  pictures  were  pre¬ 
sented,  the  child  was  asked  to  name  each  one 
to  ensure  that  they  were  familiar.  The  child  was 
then  asked  to  predict  the  picture  that  was  need¬ 
ed  to  finish  the  pattern.  This  was  intended  to 
see  whether  children  could  generate  an  analog¬ 
ical  solution  spontaneously,  without  seeing  the 
solution  pictures. 


Following  this,  the  experimenter  showed 
the  child  a  choice  of  solution  terms.  For  the 
bird/dog analogy, these were pictures of a doghouse,
a cat, another dog, and a bone. The different
choices were designed to test different
theories  of  analogical  development.  The  cor¬ 
rect  choice,  which  would  indicate  analogical 
ability,  was  the  doghouse.  The  associative 
choice  was  the  bone.  Selection  of  the  bone 
would  be  expected  if  younger  children  rely  on 
associative  reasoning  to  solve  analogies,  as 
Piaget  had  claimed.  The  other  choices  were  a 
‘mere appearance match’ choice (the second
dog), and a semantic match (the cat). ‘Mere
appearance’ matching is a term coined by
Gentner (1989) to refer to the matching of
object or ‘surface’ similarities when attempting
to solve analogies (such as choosing another
dog to match the dog in the C term).
Gentner has suggested that younger children
might rely on object similarity rather than relational
similarity in reaching analogical solutions
(Gentner, 1989).
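
The logic behind the four solution choices can be sketched as follows. The small table of relations is a hypothetical stand-in for a child's knowledge, and the relation labels (lives_in, chews, and so on) are invented for illustration; only the pictures themselves come from the study.

```python
# Hypothetical relational knowledge: (item, item) -> relation label.
RELATIONS = {
    ("bird", "nest"): "lives_in",
    ("dog", "doghouse"): "lives_in",   # analogical choice
    ("dog", "bone"): "chews",          # associative choice
    ("dog", "cat"): "same_category",   # semantic match
    ("dog", "dog"): "looks_like",      # mere appearance match
}

def analogical_choice(a, b, c, choices):
    """Pick the D term that stands to C in the same relation as B stands to A.
    Returns None when the A:B relation is unknown or no choice fits."""
    target = RELATIONS.get((a, b))
    if target is None:
        return None
    matches = [d for d in choices if RELATIONS.get((c, d)) == target]
    return matches[0] if matches else None

choices = ["doghouse", "cat", "dog", "bone"]
print(analogical_choice("bird", "nest", "dog", choices))
```

In this sketch a missing entry for the A:B pair leaves nothing to map, so the function returns no answer at all. That is the sense in which unfamiliar relations could lead a test of this kind to underestimate analogical ability.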

The  picture  matching  game  showed  that  all 
children  tested  (4-,  5-  and  9-year-olds)  per¬ 
formed  at  levels  significantly  above  chance  in 
the analogy task, selecting the correct completion
term 59%, 66% and 94% of the time, respectively.
There was no evidence of mere appearance
matching. Although many younger
children  were  shy  of  making  predictions  prior 
to  seeing  the  solution  choices,  those  who  were 
more  confident  showed  clear  analogical  abili¬ 
ty  on  this  measure  as  well.  For  example,  when 
4-year-old  Lucas  was  given  the  analogy  bird  is 
to  nest  as  dog  is  to  ?,  he  first  predicted  that  the 
correct  solution  was  puppy.  He  argued,  quite 
logically, “Bird lays eggs in her nest [the nest
in the B-term picture contained three eggs] -
dog - dogs lay babies, and the babies are - umm
-  and  the  name  of  the  babies  is  puppy!”  Lucas 
had  used  the  relation  type  of  offspring  to  solve 
the  analogy,  and  was  quite  certain  that  he  was 
correct.  He  continued  “I  don’t  have  to  look  [at 
the  solution  pictures]  —  the  name  of  the  baby 
is puppy!” Once he looked at the different solution
options, however, he decided that the doghouse
was the correct response.

The matching game also included a control
task to ensure that the correct solution to
the  analogy  was  not  simply  the  most  attractive 
pictorial  match  for  the  C  term  picture.  Here  the 
children  were  simply  shown  the  C  term  picture 
along  with  the  correct  solution  term  and  the 
distractors,  and  were  asked  to  choose  which 
picture  ‘went  best’  with  the  C  term  picture.  For 
example,  the  children  were  shown  the  picture 
of  the  dog,  and  were  asked  to  choose  the  best 
match  from  the  pictures  of  the  doghouse,  bone, 
second  dog  and  cat.  In  this  unconstrained  task, 
the  children  were  as  likely  to  select  the  asso¬ 
ciative  match  (bone)  as  the  analogy  match  (dog¬ 
house).  Additionally,  although  the  children 
readily  agreed  that  another  match  could  be  cor¬ 
rect  in  the  control  condition  (9  year  olds:  76%, 
4  year  olds:  82%),  they  were  not  so  flexible  in 
the  analogy  condition,  where  most  of  them  said 
that  only  one  answer  could  be  correct  (9  year 
olds:  89%,  4  year  olds:  60%).  This  shows 
awareness  of  the  relational  similarity  constraint 
that  governs  truly  analogical  responding.  The 
children  understood  that  the  correct  completion 
term  for  the  analogy  had  to  link  the  C  and  D 
terms  by  the  same  relation  that  linked  the  A 
and  B  terms.  Notice  that  Lucas  was  using  the 
relational  similarity  constraint  when  he  gener¬ 
ated  the  solution  ‘puppy’  for  the  bird/dog  anal¬ 
ogy.  This  cognitive  flexibility  displays  a  full 
understanding  of  analogy,  and  provides  evi¬ 
dence  of  truly  mental  operations,  thereby  meet¬ 
ing  Piaget’s  original  criteria  for  the  presence  of 
‘true’  analogical  reasoning. 

From  the  picture  analogy  game,  we  know 
that  the  ability  to  reason  by  analogy  is  present 
by  at  least  age  4.  However,  the  analogy  game 
may  still  have  underestimated  analogical  abili¬ 
ty.  This  is  because  relational  familiarity  was  not 
measured  independently  of  analogical  success. 
Instead,  it  was  simply  assumed  that  familiar  re¬ 
lations  had  been  selected  for  the  analogies,  leav¬ 
ing  open  the  possibility  that  the  younger  chil¬ 
dren  may  have  failed  in  some  trials  because  the 
relations  used  in  those  particular  analogies  were 
unfamiliar  to  them.  Alternatively,  some  children 
may  have  failed  some  analogies  because  they 
were actually reasoning about relations that were
different from those intended by the experimenter,
like Lucas.

THE RELATIONSHIP BETWEEN
RELATIONAL KNOWLEDGE AND
ANALOGICAL RESPONDING

The idea that children’s analogical performance
depends on their relational knowledge
has been called the ‘relational familiarity’ hypothesis.
In order to establish whether children’s
use of analogical reasoning is knowledge-based,
dependent on relational familiarity
rather than analogical ability, relational
knowledge  as  well  as  analogical  ability  needs 
to  be  assessed.  This  can  be  done  by  changing 
the  control  task  in  the  picture  matching  game. 
The  appropriate  control  task  measures  chil¬ 
dren’s  knowledge  of  the  relations  being  used 
in  the  analogies  that  are  presented  in  the  item 
analogy  task. 

A  second  set  of  analogy  experiments  using 
the  picture  matching  game  were  thus  carried 
out  to  test  the  relational  familiarity  hypothesis. 
This  time,  item  analogies  based  on  physical 
causal  relations  like  melting,  cutting  and  wet¬ 
ting  were  used.  These  relations  are  acquired 
early  in  development,  between  3  and  4  years  of 
age. Children were given analogies like ‘chocolate
is to melted chocolate as snowman is to
?’, and ‘playdoh is to cut playdoh as apple is to
?’. Knowledge of the causal relations required
to  solve  the  analogies  was  measured  by  giving 
the  children  pictures  of  items  that  had  been 
causally  transformed  (e.g.,  cut  playdoh,  cut 
bread,  cut  apple),  and  asking  them  to  select  the 
causal  agent  responsible  for  the  change  from  a 
set  of  pictures  of  possible  agents  (e.g.,  a  knife, 
water,  the  sun). 

This  ‘causal  relations’  version  of  the  picture 
matching game was given to children aged 3, 4
and 6 years. The results showed that both
analogical  success  and  causal  relational  knowl¬ 
edge  increased  with  age.  The  3-year-olds  solved 
52%  of  the  analogies  and  52%  of  the  control 
sequences,  the  4-year-olds  solved  89%  of  the 
analogies  and  80%  of  the  control  sequences,  and 
the 6-year-olds solved 99% of the analogies and
100% of the control sequences. There was also
a  significant  conditional  relationship  between 
performance  in  the  two  conditions,  as  would  be 
predicted  by  the  relational  familiarity  hypothe¬ 
sis.  This  conditional  relationship  showed  that 
individual  children’s  performance  in  the  analo¬ 
gy  task  was  intimately  linked  to  those  individu¬ 
al  children’s  knowledge  of  the  corresponding 
causal  relations.  Analogical  success  had  thus 
been  shown  to  be  highly  dependent  on  relation¬ 
al  knowledge.  These  experiments  showed  that 
Piaget’s  theory  of  analogical  development  could 
no  longer  be  upheld.  If  analogy  is  one  of  the  ba¬ 
sic  cognitive  processes  underlying  intellectual 
development,  then  it  should  be  found  at  work  in 
many  other  areas  of  cognition. 

ANALOGIES  IN  COGNITIVE 
DEVELOPMENT 

Analogies  in  Piagetian  Tasks 

An  elegant  theory  of  how  analogical  rea¬ 
soning  may  contribute  to  performance  in  Piag¬ 
etian  logical  tasks  has  been  proposed  by  Hal¬ 
ford  (1993).  Halford’s  basic  claim  is  that  much 
logical  reasoning  is  analogical.  According  to 
his  theory,  children  can  use  representations  of 
everyday  relational  structures  as  a  basis  for 
analogies  to  new,  isomorphic  problems  that 
share  the  same  relational  structures.  For  ex¬ 
ample,  in  order  to  solve  a  Piagetian  transitive 
inference  problem  of  the  form  Tom  is  happier 
than  Bill,  Bill  is  happier  than  John,  who  is 
happiest?  a  child  can  use  an  analogy  from  a 
familiar ordered structure that may already be
represented  in  memory.  An  example  is  the 
ordering  structure  A  above  B  above  C.  Hal¬ 
ford  has  suggested  that  all  of  Piaget’s  logical 
tasks  that  are  characteristic  of  the  ‘concrete 
operational’ stage of logical development
(transitive  reasoning,  class  inclusion,  conser¬ 
vation)  require  analogical  mappings  based  on 
pairs  of  relations. 
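
Halford’s proposal can be illustrated with a minimal sketch, assuming the child already holds a familiar top-to-bottom ordering (A above B above C) and maps the people in the premises onto it. The function name and the way the ordering is derived from the premises are assumptions made for illustration, not Halford’s own model.

```python
def map_onto_ordering(premises, slots=("A", "B", "C")):
    """Map the people named in (happier, less_happy) premises onto an
    ordered structure (A above B above C), highest first."""
    people = {person for pair in premises for person in pair}

    def outranked(person, seen=()):
        # Everyone this person is happier than, following chains of premises.
        below = {y for x, y in premises if x == person and y not in seen}
        for y in list(below):
            below |= outranked(y, seen + (person,))
        return below

    ranked = sorted(people, key=lambda p: len(outranked(p)), reverse=True)
    return dict(zip(slots, ranked))

# Tom is happier than Bill, Bill is happier than John.
premises = [("Tom", "Bill"), ("Bill", "John")]
mapping = map_onto_ordering(premises)
print(mapping)
print("Happiest:", mapping["A"])
```

Once the premises are mapped onto the familiar structure, the answer to "who is happiest?" can be read off the top position rather than derived by a separate logical rule.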

In order to test the idea that Piagetian ‘concrete
operational’ tasks can be solved by using
appropriate analogies, therefore, we must first
examine children’s ability to map pairs of relations.
This can be done by extending the classical
analogy task by linking the A and B terms
by two relations rather than one. Goswami,
Leevers, Pressley and Wheelwright (1998) designed
a set of analogies based on pairs of physical
causal relations, extending the technique
used by Goswami and Brown (1989). We asked
3-,  4-,  5-  and  6-year-old  children  to  make  rela¬ 
tional  mappings  based  on  either  single  causal 
relations  like  cut,  paint,  and  wet,  or  pairs  of 
causal  relations,  like  cut  +  wet  and  mend  + 
paint.  This  experimental  paradigm  provides  a 
relatively  pure  test  of  the  ability  to  make  anal¬ 
ogies  about  pairs  of  relations. 

Our experiment had four conditions: a single-relation
analogy condition (e.g., apple: cut
apple:: hair: cut hair), a double-relation analogy
condition (e.g., apple: cut, wet apple:: hair:
cut,  wet  hair),  a  single-relation  control  condi¬ 
tion  and  a  double-relation  control  condition.  In 
the  control  conditions,  the  children  were  asked 
to  select  the  picture  of  the  causal  agent  or  the 
pair  of  causal  agents  responsible  for  the  causal 
changes  shown  in  the  analogies,  following  Gos¬ 
wami  and  Brown  (1989). 

Children’s  performance  in  the  analogy  and 
the  control  conditions  was  then  examined  as  a 
function  of  Condition  and  Age.  The  pattern  of 
the  results  was  remarkably  similar  to  the  pat¬ 
tern  found  in  the  causal  relations  analogies  used 
by Goswami and Brown (1989). There was a
close  correspondence  between  analogy  perfor¬ 
mance  and  performance  in  the  relational  knowl¬ 
edge  control  conditions  for  both  the  single  re¬ 
lation  and  the  double  relation  analogies.  For  the 
single  relation  conditions,  the  3-year-olds 
solved  33%  of  the  analogies  and  46%  of  the 
control  sequences,  the  4-year-olds  solved  51% 
of  the  analogies  and  63%  of  the  control  se¬ 
quences,  the  5-year-olds  solved  72%  of  the 
analogies  and  76%  of  the  control  sequences, 
and the 6-year-olds solved 89% of the analogies
and 88% of the control sequences. For the
double  relation  conditions,  the  3-year-olds 
solved  13%  of  the  analogies  and  31%  of  the 
control  sequences,  the  4-year-olds  solved  50% 
of  the  analogies  and  50%  of  the  control  se¬ 
quences,  the  5-year-olds  solved  62%  of  the 
analogies and 74% of the control sequences,
and  the  6-year-olds  solved  78%  of  the  analo¬ 
gies  and  91%  of  the  control  sequences.  Analy¬ 
ses  demonstrated  no  interaction  between  age 
and  number  of  relations,  although  the  main  ef¬ 
fect  of  number  of  relations  almost  reached  sig¬ 
nificance,  reflecting  the  fact  that  children  of  all 
ages  found  the  double  relation  analogies  and 
control  sequences  more  difficult  than  the  sin¬ 
gle  relation  analogies  and  control  sequences. 
Goswami  et  al.  concluded  that  the  ability  to 
solve  analogies  based  on  pairs  of  relations  was 
governed  by  relational  familiarity.  As  long  as 
familiar  relational  structures  are  chosen  as  a 
basis  for  analogy,  therefore,  young  children 
should  be  able  to  use  analogies  to  help  them  to 
solve  Piagetian  reasoning  tasks. 

Analogies  in  a  Transitive  Mapping  Task 

Halford  has  suggested  that  familiar  ordered 
structures  may  provide  useful  analogies  for  tran¬ 
sitive  reasoning  tasks.  Family  members  provide 
a  familiar  example  of  an  ordering  structure 
based  on  size,  as  in  most  families  the  father  (F) 
is  taller  than  the  mother  (M),  and  the  mother  is 
taller  than  the  young  child  (C).  If  knowledge  of 
the  familiar  relational  structure  F  >  M  >  C  is 
present  in  young  children,  then  children  who 
have  mentally  represented  this  relational  struc¬ 
ture  should  be  able  to  solve  transitive  mapping 
tasks  using  less  familiar  relations. 

Goswami  (1995)  examined  this  hypothe¬ 
sis  using  Goldilocks  and  the  Three  Bears  as  a 
familiar  example  of  family  size  relations  (Dad¬ 
dy  Bear  >  Mummy  Bear  >  Baby  Bear).  Three- 
and  4-year-old  children  were  asked  to  use  the 
relational  structure  represented  by  the  Three 
Bears  as  a  basis  for  solving  transitive  ordering 
problems  involving  perceptual  dimensions  such 
as  temperature,  loudness,  intensity,  and  width. 
The transitive mapping test was presented by
asking the children to imagine going to the
Three Bears’ house, and then to imagine looking
at their different belongings. This imagination
task constituted a fairly abstract test. For
example, the imaginary bowls of the Three
Bears’ porridge could be either boiling hot, hot,
or warm, and the child had to decide which
bowl of porridge belonged to which bear. In
order to give the correct answer, the child had
to map the transitive height ordering of Daddy,
Mummy, and Baby Bear to the different porridge
temperatures, giving Daddy Bear the boiling
hot porridge, Mummy Bear the hot porridge,
and Baby Bear the warm porridge (these mappings
do not follow the original fairy tale, in
which Daddy Bear’s porridge was too salty, and
Mummy Bear’s was too sweet).

The  results  showed  that  the  percentage  of 
correctly  ordered  mappings  approached  ceiling 
for  the  4-year-olds  for  most  of  the  dimensions 
used.  The  lowest  levels  of  performance  oc¬ 
curred  for  width  (of  beds,  62%  correct),  and 
hardness  (of  chairs,  76%  correct),  and  the  high¬ 
est  occurred  for  temperature  (of  porridge,  95% 
correct).  Performance  with  the  width  dimen¬ 
sion  (wide  bed,  medium  bed,  narrow  bed)  was 
possibly  affected  by  worries  that  a  baby  could 
fall  out  of  a  narrow  bed,  as  many  children  allo¬ 
cated  the  medium  bed  to  Baby  Bear.  They  were 
then  left  without  a  bed  for  Mummy  Bear.  The 
3-year-olds produced correctly ordered mappings
for only some of the dimensions, performance
being above chance (17%) for the dimensions
of temperature of porridge (31% correct),
pitch of voice (31% correct), and height
of mirrors (62% correct, but an isomorphic
relation). Relational familiarity and real-world
knowledge  about  family  size  relations  seem  to 
have  helped  the  3-year-olds  with  these  particu¬ 
lar  dimensions.  The  children  are  unlikely  to 
have  based  their  correct  mappings  on  the  story, 
as  none  of  these  dimensions  was  mentioned  in 
the  Three  Bears  book  that  was  read  to  them  as 
part  of  the  study. 
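
The chance level of 17% quoted above can be checked by enumeration: three belongings can be allocated to three bears in 3! = 6 ways, and only one allocation preserves the ordering, giving 1/6, or about 17%. The following is a minimal Python sketch of the order-preserving mapping and of that chance calculation. The names `BEARS`, `PORRIDGE`, `transitive_mapping` and `chance_level` are illustrative assumptions and do not come from the original study.

```python
from itertools import permutations

# Base structure: the family size ordering, largest first (Daddy > Mummy > Baby).
BEARS = ["Daddy Bear", "Mummy Bear", "Baby Bear"]

# Target dimension: porridge temperatures, most extreme first.
PORRIDGE = ["boiling hot", "hot", "warm"]


def transitive_mapping(base, target):
    """Pair the i-th element of the base ordering with the i-th target value."""
    return dict(zip(base, target))


def chance_level(n_items):
    """Probability that a random one-to-one allocation preserves the ordering."""
    orderings = list(permutations(range(n_items)))
    correct = [p for p in orderings if list(p) == sorted(p)]
    return len(correct) / len(orderings)


mapping = transitive_mapping(BEARS, PORRIDGE)
print(mapping["Daddy Bear"])         # boiling hot
print(round(chance_level(3) * 100))  # 17
```

The sketch assumes that a response counts as correct only when all three items are allocated in order, which is how the 17% figure appears to be derived.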

Analogies  in  a  Class  Inclusion  Task 

Families  also  provide  a  familiar  example 
of  an  inclusive  relationship,  as  family  mem¬ 
bers  can  be  divided  into  two  distinct  sub-sets, 
parents  and  children,  both  of  which  are  mem¬ 
bers  of  the  total  set  of  family  members  (Hal¬ 
ford,  1993).  In  order  to  see  whether  the  fami¬ 
ly  as  a  familiar  example  of  inclusive  relations 
could  act  as  a  basis  for  successful  performance 
in  Piagetian  class  inclusion  tasks,  Goswami, 
Pauen and Wilkening (1996) devised the ‘create-a-family’
paradigm. In this paradigm, children
were shown a toy family, for example a
family of toy mice (2 large mice as parents, 3
small mice as children). Their job was to create
analogous families (2 parents and 3 children)
from an assorted pile of toys (such as
toy  cars,  spinning  tops,  balls  and  helicopters). 
After  the  children  had  correctly  created  4  anal¬ 
ogous  families,  they  were  given  4  class  inclu¬ 
sion  problems  involving  toy  frogs,  sheep, 
building blocks and balloons. The class inclusion
problems were posed using collection
terms (‘group’, ‘herd’, ‘pile’, ‘bunch’). The
children in Goswami et al.’s study (4- to
5-year-olds) had all failed the traditional Piagetian
class inclusion task, which was given as a
pretest (“Are there more red flowers or more
flowers?”). A control group of children received
the same class inclusion problems using
collection terms, but did not receive the
‘create-a-family’ analogy training session.

Goswami  et  al.  found  that  more  children 
in  the  ‘create-a-family’  analogy  condition  than 
in  the  control  condition  solved  at  least  3  of  the 
4  class  inclusion  problems  involving  frogs, 
sheep,  building  blocks  and  balloons.  This  ef¬ 
fect  was  particularly  striking  at  age  4,  in  which 
no  improvement  at  all  was  found  in  the  control 
group  with  the  collection  term  wording.  It 
should  be  remembered  that  all  of  the  children 
had  previously  failed  Piagetian  class  inclusion 
tasks.  Goswami  et  al.  argued  that  this  improve¬ 
ment  was  a  result  of  the  use  of  analogies  based 
on  a  representation  of  family  structure. 

Analogies  in  Foundational  Domains 

One  popular  view  of  cognitive  develop¬ 
ment  is  that  conceptual  development  can  be 
understood  in  terms  of  three  ‘foundational’ 
domains.  These  are  the  domains  of  naive  biol¬ 
ogy,  naive  physics,  and  naive  psychology  (Well¬ 
man  &  Gelman,  in  press).  Wellman  and  Gel- 
man  argue  that,  rather  than  developing  a  mono¬ 
lithic  understanding  of  the  world,  young  chil¬ 
dren  develop  distinct  conceptual  frameworks 
to describe these ‘foundational’ domains, even
though many concepts will be represented in
more  than  one  of  these  foundational  frame¬ 
works  (for  example,  persons  are  psychological 
entities,  biological  entities  and  physical  enti¬ 
ties).  Wellman  and  Gelman  suggest  that  chil¬ 
dren  will  use  at  least  two  levels  of  analysis  with¬ 
in  any  framework,  one  that  captures  surface 
phenomena  (mappings  based  on  attributes)  and 
another  that  penetrates  to  deeper  levels  (map¬ 
pings  based  on  relations).  This  means  that  anal¬ 
ogies  should  be  at  work  within  foundational 
domains.  Although  no-one  has  yet  studied  the 
role  of  analogies  in  the  foundational  domain  of 
psychology (‘theory of mind’), studies of the
role  of  analogies  in  developing  conceptual  un¬ 
derstanding  in  the  domains  of  naive  biology  and 
naive  physics  can  be  found. 

Analogy  as  a  Mechanism  for  Understanding 
Biological  Principles 

Evidence  that  analogy  is  an  important 
mechanism  for  understanding  biological  prin¬ 
ciples  comes  from  a  series  of  studies  by  Inaga- 
ki  and  her  colleagues.  They  were  interested  in 
how  often  children  would  base  their  predictions 
about  biological  phenomena  on  analogies  to 
people: the ‘personification’ analogy. As human
beings are the biological kinds best known
to  young  children,  it  seems  plausible  that  chil¬ 
dren  may  use  their  biological  knowledge  about 
people  to  understand  biological  phenomena  in 
other natural kinds. For example, Inagaki and
Sugiyama (1988) asked 4-, 5-, 8- and 10-year-olds
a range of questions about various properties
of 8 target objects, including “Does x
breathe?”, “Does x have a heart?”, “Does x feel
pain if we prick it with a needle?”, and “Can x
think?”. The target objects were people, rabbits,
pigeons, fish, grasshoppers, trees, tulips
and  stones.  Prior  similarity  judgements  had  es¬ 
tablished  that  the  target  objects  differed  in  their 
similarity  to  people  in  this  order,  with  rabbits 
being  rated  as  most  similar  and  stones  being 
rated  as  least  similar.  The  children  all  showed 
a decreasing tendency to attribute the physiological
properties (“Does x breathe?”) to the target
objects as the perceived similarity to a person
decreased. Apart from the 4-year-olds, very
few children attributed physiological attributes
to stones, tulips and trees, and even 4-year-olds
only attributed physiological properties to
stones 15% of the time. A similar pattern was
found for the mental properties (“Can x
think?”). This study supports the idea that
preschoolers’ understanding of biological phenomena
arises from analogies based on their
understanding of people.

Analogy  as  a  Mechanism  for  Understanding 
Physical  Principles 

Evidence  that  analogy  is  an  important 
mechanism  for  understanding  physical  prin¬ 
ciples  comes  from  a  series  of  studies  by  Pauen 
and  her  colleagues.  Pauen  has  studied  chil¬ 
dren’s  understanding  of  the  principles  govern¬ 
ing  the  interaction  of  forces,  by  using  a  spe¬ 
cial  apparatus  called  the  ‘force  table’.  The 
force  table  consists  of  an  object  that  is  fixed 
at  the  centre  of  a  round  platform.  Two  forces 
act  on  this  object,  both  represented  by  plates 
of  weights.  The  plates  of  weights  hang  from 
cords  attached  to  the  central  object  at  either 
45°, 75° or 105° to each other. The children’s
job  is  to  work  out  the  trajectory  of  the  object 
once  it  is  released  from  its  fixed  position.  Their 
predictions  concerning  this  trajectory  are 
scored  in  terms  of  whether  they  consider  only 
a  single  force  (plate  of  weights),  or  whether 
they  integrate  both  forces  in  order  to  deter¬ 
mine  the  appropriate  trajectory.  The  force  ta¬ 
ble  problem  is  presented  to  the  children  in  the 
context  of  a  story  about  a  King  (central  ob¬ 
ject)  who  has  got  tired  of  skating  on  a  frozen 
lake  (the  platform)  and  who  wants  to  be  pulled 
into his royal bed on the shore. Children aged
6, 7, 8 and 9 years were tested.

Pauen found that most of the younger children
(80-85%) predicted that the king would
move  in  the  direction  of  the  stronger  force  only 
(the  larger  plate  of  weights).  An  ability  to  con¬ 
sider  the  two  forces  simultaneously  was  only 
shown  by  some  of  the  9-year-olds  (45%).  Such 
integration  rule  responses  were  shown  by  the 
majority of the adults tested (63%). Pauen speculated
that the one-force-only responses may have
arisen because the children who received the
plates of weights applied a balance scale analogy
to the force integration problem. A balance
scale analogy gives rise to one-force-only
solutions, which are incorrect.
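
The difference between the two response types can be made concrete with a small calculation. The following is a minimal Python sketch, assuming an idealised force table in which each plate of weights pulls along its cord and directions are measured in degrees from the cord of the first force. The function names and the example weights are hypothetical and are not taken from Pauen's studies.

```python
import math


def one_force_only(w1, w2, angle_deg):
    """Predicted direction if only the stronger force is considered."""
    return 0.0 if w1 >= w2 else float(angle_deg)


def integration_rule(w1, w2, angle_deg):
    """Direction of the vector sum of both forces, in degrees from force 1."""
    theta = math.radians(angle_deg)
    x = w1 + w2 * math.cos(theta)  # force 1 lies along the x-axis
    y = w2 * math.sin(theta)
    return math.degrees(math.atan2(y, x))


# Compare the two predictions for the three cord angles used on the force table.
for angle in (45, 75, 105):
    print(angle, one_force_only(3, 1, angle), round(integration_rule(3, 1, angle), 1))
```

Under the integration rule the predicted path always lies between the two cords, closer to the stronger force. Under the one-force-only rule it lies along the cord of the stronger force, which is the answer a conventional balance scale would suggest.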

This  idea  about  the  balance  scale  analogy 
was  prompted  by  the  comments  of  the  children 
themselves,  who  said  that  the  force  table  re¬ 
minded  them  of  a  balance  scale  (presumably 
because  of  the  plates  of  weights).  This  led  Pauen 
to  propose  that  the  children  were  using  sponta¬ 
neous  analogies  in  their  reasoning  about  the 
physical  laws  underlying  the  force  table,  anal¬ 
ogies  that  were  in  fact  misleading.  To  investi¬ 
gate  this  idea  further,  Pauen  and  Wilkening  (in 
press)  gave  9-year-old  children  a  training  ses¬ 
sion  with  a  balance  scale  prior  to  giving  them 
the  force  table  problem.  One  group  of  children 
received  training  with  a  traditional  balance 
scale,  in  which  they  learned  to  apply  the  one- 
force-only  rule,  and  a  second  group  of  children 
received  training  with  a  modified  balance  scale 
that  had  its  centre  of  gravity  below  the  axis  of 
rotation  (a  ‘swing  boat’  suspension).  This  mod¬ 
ified  balance  scale  provided  training  in  the  in¬ 
tegration  rule,  as  the  swing  boat  suspension 
meant  that  even  though  the  beam  rotated  to¬ 
wards  the  stronger  force,  the  degree  of  deflec¬ 
tion  depended  on  the  size  of  both  forces. 

Following  the  balance  scale  training,  the 
children  were  given  the  force  table  task  with 
the  plates  of  weights.  A  third  group  of  children 
received  only  the  force  table  task,  and  acted  as 
untrained  controls.  Pauen  and  Wilkening  argued 
that  an  effect  of  the  analogical  training  would 
be  shown  if  the  children  who  were  trained  with 
the  traditional  balance  scale  showed  a  greater 
tendency  to  use  the  one-force-only  rule  than  the 
control  group  children,  while  the  children  who 
were  trained  with  the  modified  balance  scale 
showed  a  greater  tendency  to  use  the  integra¬ 
tion  rule  than  the  control  group  children.  This 
was  exactly  the  pattern  that  they  found.  The 
children’s  responses  to  the  force  table  problem 
varied  systematically  with  the  solution  provid¬ 
ed  by  the  analogical  model.  These  results  sug¬ 
gest  that  the  children  were  using  spontaneous 
analogies  in  their  reasoning  about  physics,  just 
as we have seen them do in their reasoning
about biology.


ABSTRACT

It is proposed that models based on processing
relations capture the structure sensitivity
of higher cognitive processes while they can
also be compared with more basic processes
such as associations. Relations have the following
properties that are not shared by associations:
there is an explicit symbol for each relational
instance, allowing it to be manipulated;
higher-order relations can be formed that have
lower-order relations as arguments; given any
n-1 components of an n-ary relation the remaining
component can be retrieved (omni-directional
access); and representation of relational
instances is a prerequisite to analogical mapping.
A model is proposed in which each component
of a relational instance is represented
by a vector, and the binding is represented by
computing the outer product of the vectors. This
architecture has been used to model analogy
and human memory. It can also be used to
model structural effects on both similarity and
category formation. Computational cost increases
exponentially with representational
rank, defined as number of components that are
bound into a representation. Thus the model
provides a natural explanation for processing
capacity limitations in humans and higher animals.
Each rank corresponds to a class of psychological
processes, neural nets, and empirical
criteria. The ranks and typical concepts
which belong to them, are: Rank 0, elemental
association; Rank 1, content-specific representations
and configural associations; Rank 2,
unary relations, class membership, variable-constant
bindings; Rank 3, binary relations, proportional
analogies; Rank 4, ternary relations,
transitivity and hierarchical classification; Rank
5, quaternary relations, proportion and the balance
scale; Rank 6, quinary relations. Rank 0
can be performed by 2-layered nets, rank 1 by
3-layered nets, and ranks 2-6 by tensor products
of the corresponding number of vectors.
All animals with nervous systems perform rank
0, vertebrates perform rank 1, other primates
perform rank 2-3, but ranks 4-6 are uniquely
human. Rank also increases with age. Implications
of this model are developed for human
reasoning and cognitive development.

In this paper we will present an outline of a
theory that provides a general metric for cognitive
complexity, and specifies properties of
higher cognitive processes in a way that enables
them to be distinguished systematically from
more basic cognitive functions. The theory distinguishes
the cognition of humans from other
animals, distinguishes levels of cognitive development,
and accounts for processing loads
in cognitive tasks, within a common metric
based on structural complexity. The levels of
complexity are related systematically both to
neural net architectures and to empirical criteria.
Analogy has a central role in this theory,
first because it is a core mechanism in higher
cognition, and second because lower cognitive
processes cannot implement analogy.
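
The binding scheme summarised in the abstract (one vector per component, bound by an outer product) can be illustrated with a minimal Python sketch. It assumes one-hot vectors so that retrieval is exact, and it stores only the non-zero entries of each tensor. The vocabulary, the function names and the sparse storage are illustrative assumptions and are not the authors' implementation.

```python
from itertools import product

VOCAB = ["larger", "whale", "fish", "horse", "dog"]


def one_hot(index, size):
    """Orthonormal vector with a single 1.0 at the given index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


VEC = {name: one_hot(i, len(VOCAB)) for i, name in enumerate(VOCAB)}


def bind(*vectors):
    """Outer product of the vectors, stored sparsely as {index tuple: value}."""
    tensor = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        value = 1.0
        for v, i in zip(vectors, idx):
            value *= v[i]
        if value:
            tensor[idx] = value
    return tensor


def add(t1, t2):
    """Superimpose two bindings in the same memory."""
    out = dict(t1)
    for idx, value in t2.items():
        out[idx] = out.get(idx, 0.0) + value
    return out


def retrieve(memory, cues, size):
    """Recover the missing component; None marks the unknown slot in cues."""
    slot = cues.index(None)
    result = [0.0] * size
    for idx, value in memory.items():
        weight = value
        for pos, cue in enumerate(cues):
            if pos != slot:
                weight *= cue[idx[pos]]
        result[idx[slot]] += weight
    return result


def decode(vector):
    """Name of the vocabulary item with the largest activation."""
    return VOCAB[max(range(len(vector)), key=vector.__getitem__)]


# Two rank-3 bindings: relation symbol, first argument, second argument.
memory = add(bind(VEC["larger"], VEC["whale"], VEC["fish"]),
             bind(VEC["larger"], VEC["horse"], VEC["dog"]))

# Omni-directional access: any one slot can be the unknown.
print(decode(retrieve(memory, [VEC["larger"], VEC["whale"], None], len(VOCAB))))  # fish
print(decode(retrieve(memory, [VEC["larger"], None, VEC["dog"]], len(VOCAB))))    # horse

# A dense rank-3 tensor over 5-element vectors needs 5 ** 3 units.
print(len(VOCAB) ** 3)  # 125
```

The last line shows why cost grows exponentially with rank: with vectors of length m, a dense binding of rank r needs m to the power r units.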

Although interest in analogy dates back to
near the beginning of scientific psychology
(Piaget, 1950; Spearman, 1923), understanding
of human analogical reasoning accelerated
dramatically in the 1980s (Gentner, 1983; Gick &
Holyoak, 1983). Analogy is a natural mechanism
for human reasoning, but we will suggest
that its involvement in higher cognition might
be even greater than previously realised. It has
proven difficult to produce effective models of
human reasoning based on logical inference
rules. Such models do exist (Braine, 1978; Rips,
1989), but most theorists have chosen to model
reasoning on the basis of alternative psychological
mechanisms such as memory retrieval
(Kahneman & Tversky, 1973), mental models
(Johnson-Laird, 1983; Johnson-Laird & Byrne,
1991) or pragmatic reasoning schemas (Cheng
& Holyoak, 1985). Analogy can play a role in
human reasoning and is also entailed in some
significant ways in a number of these other
models. We can illustrate this using pragmatic
reasoning schemas.

Although it has become fashionable to interpret
pragmatic reasoning schemas as being
specialised for deontic reasoning, they may be
more widely applicable. Consistent with this,
we will use the definition of pragmatic reasoning
schemas as structures of general validity that
are induced from ordinary life experience. One
type of pragmatic reasoning schema, permission,
is known to improve performance on the
Wason Selection Task (WST; Cheng & Holyoak,
1985). In this task participants are given four
cards containing p, not-p, q, and not-q, and asked
which cards must be turned over to test the rule
p -> q. Analogy plays a central role here, because
as Figure 1 shows, the elements and relations
presented in the WST can be mapped into a
permission or prediction schema. This can be done by
application of the principles that are incorporated
in contemporary computational models
of analogy (Falkenhainer, Forbus, & Gentner,
1989; Gray, Halford, Wilson, & Phillips, 1997;
Hummel & Holyoak, 1997; Mitchell & Hofstadter,
1990) and no special mechanism is required.

A possible reason why induction of a permission
schema improves performance is that, as
Table  1  shows,  permission  is  isomorphic  to  the 
conditional.  Extending  this  argument,  a  possi¬ 
ble  reason  for  the  tendency  to  respond  in  terms 
of  the  biconditional  p  <->  q,  is  that  participants 
may  otherwise  interpret  the  rule  as  a  predic¬ 
tion.  As  Table  1  shows,  prediction  is  isomor¬ 
phic  to  the  biconditional.  This  argument  has 
been  presented  in  more  detail  elsewhere  (Hal¬ 
ford,  1993).  It  implies  that  the  importance  of 
permission  is  not  that  it  is  deontic,  but  that  it  is 
isomorphic  to  implication.  While  we  would  not 
suggest  that  this  argument  does  justice  to  the 
extensive  literature  on  either  the  Wason  Selec¬ 
tion  Task  or  pragmatic  reasoning  schemas,  it 
does serve to illustrate that analogy can serve
as  the  basic  mechanism  even  in  tasks  such  as 
WST  that  might  normally  be  considered  to  en¬ 
tail  logical  reasoning. 
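
The contrast between the conditional and biconditional readings of the rule can be sketched in Python. The sketch treats each card as showing one value with the other value hidden, and lists the cards whose hidden side could reveal a violation. The card encoding and the function names are illustrative assumptions. The sketch models only the logic of the two readings, not the schema-mapping process itself.

```python
def conditional(p, q):
    """Truth table of p -> q (the reading the text aligns with permission)."""
    return (not p) or q


def biconditional(p, q):
    """Truth table of p <-> q (the reading the text aligns with prediction)."""
    return p == q


# Each card shows the value of one variable; the other side is hidden.
CARDS = {
    "p": ("p", True),
    "not-p": ("p", False),
    "q": ("q", True),
    "not-q": ("q", False),
}


def cards_to_turn(rule):
    """Cards whose hidden side could show that the rule has been violated."""
    chosen = []
    for name, (shown, value) in CARDS.items():
        for hidden in (True, False):
            p, q = (value, hidden) if shown == "p" else (hidden, value)
            if not rule(p, q):
                chosen.append(name)
                break
    return chosen


print(cards_to_turn(conditional))    # ['p', 'not-q']
print(cards_to_turn(biconditional))  # ['p', 'not-p', 'q', 'not-q']
```

A participant who maps the rule into a structure isomorphic to the conditional should therefore select only the p and not-q cards, whereas a biconditional reading makes every card relevant.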


ANALOGY,  RELATIONS  AND  HIGHER 
COGNITIVE  PROCESSES 

Although there are big differences between
contemporary computational models of analogy,
there is some degree of consensus about the
core processes. In particular, it seems clear that
analogy is a matter of mapping relations or
relational instances between two representations.
The core principles seem to be: the elements in
one structure, the base, are mapped uniquely to
the elements of the other structure, the target;
and if a predicate P in the base is mapped to
the predicate P’ of the target, the arguments of
P are mapped to the arguments of P’. The relational
instances may be coded in the input
(Falkenhainer et al., 1989; Gray et al., 1997;
Hummel & Holyoak, 1997) or they may be constructed
dynamically during the running of the
model (Mitchell & Hofstadter, 1990), but a
mapping between relational instances seems to
constitute the essence of analogy in most models.
It seems fair to say that an organism that
could not represent relations or relational instances
could not perform analogy. If we accept
that analogy is one of the core processes
in higher cognition, then ability to process relations
and relational instances is also likely to
be important in higher cognition. This is really
an argument for the importance of structure in
higher cognition, because relations are the essence
of structure (a structure is a set on which
one or more relations are defined).

Our  next  step  is  to  consider  those  proper¬ 
ties  of  higher  cognitive  processes  on  which 
there  seems  to  be  reasonable  consensus.  One 
such  property  is  representation  of  structure,  to¬ 
gether  with  ability  to  operate  on  that  structur¬ 
al  representation.  This  is  generally  seen  as  the 
essence  of  higher  cognition.  The  central  role 
of  structure  in  higher  cognitive  processes  has 
been  recognised  historically  (Humphrey, 
1951) and by a number of writers in this century,
including Gestaltists (Wertheimer, 1945),
Piagetians  (Piaget,  1950),  information  pro¬ 
cessing  theorists  (Anderson,  1983;  Hunt,  1962; 
Miller,  Galanter,  &  Pribram,  1960;  Newell, 
1990)  and  linguists  (Chomsky,  1980;  Fodor, 
1975).  One  role  of  analogy  is  to  form  map¬ 
pings  between  structures,  so  on  these  grounds 
also  analogy  might  be  considered  a  core  pro¬ 
cess  in  higher  cognition. 

There is also reasonable consensus that
higher cognitive processes entail variables,
which are essential to the generality and
content-independence that characterise higher
cognitive processes. An entire generation of
cognitive models is based on rules, a distinguishing
characteristic of which is that they relate
variables. Production rules are perhaps the most
common example (Anderson, 1983; Newell,
1990) and production systems normally have
provision for variable binding. Smith, Langston
and Nisbett (1992) make a case for logical inference
rules being used in natural reasoning.
These rules relate variables. For example,
modus ponens is a logical inference rule of the
form if p then q, p therefore q, where p and q
are variables. Pragmatic reasoning schemas
(Cheng & Holyoak, 1985) are more content-specific
than abstract logical inference rules,
but still relate variables. Thus the permission
schema can be expressed as: to perform act a,
you must have permission p.

Analogies  can  simulate  variables  by  put¬ 
ting  instances  of  a  relation  in  correspondence 
with  each  other.  Consider  for  example  the  fol¬ 
lowing  relational  instances: 
larger(whale,fish), 
larger(horse,dog), 
larger(5,3). 

Each  relational  instance  has  two  roles  or 
slots,  one  filled  by  the  larger  entity  and  one  by 
the  smaller  entity  in  a  given  pair.  Because  each 
role  can  be  instantiated  in  a  variety  of  ways,  it 
effectively  functions  as  a  variable,  but  only  if 
the  arguments  are  in  correspondence.  It  would 
not be true if the relational instances were
cross-mapped, as in this case:
larger(whale,fish),
larger(dog,horse).

Models  of  analogy  include  mechanisms  for 
ensuring  structural  correspondence.  Indeed  this 
is  a  core  process  in  analogy  models.  Therefore 
they  provide  a  mechanism  that  is  capable  of  at 
least  limited  processing  of  variables. 
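
The point about correspondence can be illustrated with a minimal Python sketch. Each relational instance is written as a tuple of relation symbol, first argument and second argument, and a role is treated as behaving like a variable only when every instance holds with its fillers in the same role order. The `SIZE` table is a hypothetical stand-in for world knowledge, and all function names are assumptions made for illustration.

```python
# Hypothetical sizes in arbitrary units, used only to decide which instances hold.
SIZE = {"whale": 1000, "fish": 1, "horse": 50, "dog": 5, 5: 5, 3: 3}


def holds(instance):
    """True if the instance larger(first, second) is consistent with SIZE."""
    name, first, second = instance
    return name == "larger" and SIZE[first] > SIZE[second]


def align(base, target):
    """Map base to target slot by slot: relation to relation, argument to argument."""
    return list(zip(base, target))


def roles_behave_as_variables(instances):
    """True when every instance holds with its fillers in the same role order."""
    return all(holds(instance) for instance in instances)


aligned = [("larger", "whale", "fish"), ("larger", "horse", "dog"), ("larger", 5, 3)]
crossed = [("larger", "whale", "fish"), ("larger", "dog", "horse")]

print(align(aligned[0], aligned[1]))
print(roles_behave_as_variables(aligned))  # True
print(roles_behave_as_variables(crossed))  # False
```

In the aligned set the first slot always holds the larger entity, so that slot can stand in for a variable. In the cross-mapped set it cannot.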

Higher cognitive processes are widely regarded
as incorporating symbols, even though
the issue has become complicated by the debate
between proponents of symbolic and connectionist
models. Newell argued that symbols
and a system that operates on them are necessary
for intelligent action (see Newell, 1990,
p. 170). Fodor and Pylyshyn (1988) argue that
symbols are vital to cognition. Smolensky
(1988), a connectionist modeller, does not
deny the importance of symbols per se, but
seeks to explain them at the subsymbolic level,
rather than accepting them as a primary
datum. With this proviso, there does seem to
be widespread acceptance of the importance
of symbols in higher cognitive processes.

Analogical reasoning mechanisms operate
on relations that are symbolic in the sense that
they include a label that specifies the link between
the entities that are related. Thus in the
instances considered above, the entities in the
pairs (whale,fish) and (horse,dog) are linked
by the relation symbol “larger”. Mathematically,
an n-ary relation is a subset of the Cartesian
product of n sets, but the subset is typically
specified by a label; for example,
>(..., (3,2), ..., (5,1), ...). The existence of a label
and an ordering over relational elements (i.e.,
R(a,b) is not the same as R(b,a)) are important
characteristics that distinguish relations
from other psychological structures such as
associations, as we will argue later. We will
briefly consider some further properties of
higher cognition.
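
The mathematical definition can be written out directly. The following minimal Python sketch builds the binary relation > over a small, arbitrarily chosen domain as a labelled subset of a Cartesian product, and contrasts it with an unlabelled, unordered pairing. The domain, the `RELATIONS` dictionary and the use of `frozenset` to stand in for an association are illustrative assumptions.

```python
from itertools import product

DOMAIN = range(1, 6)  # the integers 1 to 5, chosen arbitrarily for the example

# The relation is the subset of DOMAIN x DOMAIN whose pairs satisfy a > b.
greater_than = {(a, b) for a, b in product(DOMAIN, repeat=2) if a > b}

# The label is what allows the subset to be referred to and manipulated.
RELATIONS = {">": greater_than}

print((3, 2) in RELATIONS[">"], (5, 1) in RELATIONS[">"])  # True True
print((2, 3) in RELATIONS[">"])                            # False

# An unordered pairing loses the distinction between R(a,b) and R(b,a).
print(frozenset((3, 2)) == frozenset((2, 3)))              # True
```

The ordered tuples keep the two roles distinct, which is the property the text attributes to relations and denies to associations.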

Compositionality has come to be accepted
as a property of higher cognitive processes
since the work of Fodor and Pylyshyn (1988).
In essence it means that the components of a
cognitive representation retain their identity
when they are composed into more complex
representations, and both the components and
the composites are semantically evaluable. As
we will see, there are cognitive processes such
as configural association, for which these properties
do not hold.

Systematicity  is  another  property  that  has 
been  accepted  as  important  in  higher  cognition 
since Fodor & Pylyshyn (1988), although it too
has been the subject of some controversy
(Niklasson & van Gelder, 1994; van Gelder &
Niklasson, 1994). In essence, it implies generalisation
to all logically or structurally equivalent
situations,  although  it  is  generally  accepted  that 
content  can  also  influence  performance,  inde¬ 
pendent  of  structure,  to  some  extent.  Analogy 
clearly  has  the  potential  to  be  a  core  mecha¬ 
nism  in  achieving  systematicity. 

Categories  are  another  property  of  higher 
cognition.  We  will  not  consider  this  complex 
topic  here,  except  to  say  that  categories  must 
entail  a  label  that  is  independent  of  content. 

Modifiability  on  line  is  a  property  of  high¬ 
er  cognitive  processes  that  has  been  highlight¬ 
ed  by  the  work  of  Clark  and  Karmiloff-Smith 
(1993).  Higher  cognitive  processes  should  also 
be  productive  or  generative,  in  the  sense  that 
they  can  produce  or  comprehend  new  sentenc¬ 
es,  can  generate  new  representations,  and  make 
new  inferences.  This  is  true  of  both  human  and 
nonhuman  primates,  because  apes  show  some 
inventiveness  (Kohler,  1957)  and  ability  to  draw 
inferences  (Tomasello  &  Call,  1997).  Further¬ 
more,  though  we  will  not  pursue  the  question 
here,  the  approach  we  have  adopted  can  model 
some limited forms of creativity (Halford,
Wiles,  Humphreys,  &  Wilson,  1993). 

We  do  not  include  conscious  awareness  and 
language  as  criterial  properties  of  higher  cog¬ 
nition.  Awareness  has  proven  to  be  a  difficult 
criterion  to  use,  as  the  implicit  learning  litera¬ 
ture  has  shown  (Neal  &  Hesketh,  1997).  As  we 
wish  to  include  some  nonlinguistic,  nonhuman 
species  as  having  at  least  some  forms  of  higher 
cognition,  then  language  cannot  be  included 
either.  We  see  conscious  awareness  and  lan¬ 
guage  as  correlated  rather  than  criterial  proper¬ 
ties  of  higher  cognition. 


We  want  to  suggest  that  relational  process¬ 
ing  can  capture  the  properties  of  higher  cogni¬ 
tion.  Relations  are  preferable  to  rules,  which 
have  been  used  to  model  higher  cognitive  pro¬ 
cesses  and  to  distinguish  them  from  basic  pro¬ 
cesses  that  have  been  characterised  as  associa¬ 
tive  (Sloman,  1996)  or  instance-based  (Smith 
et  al.,  1992).  Some  cognitive  representations 
such  as  loves(John,Mary)  or 
contains(cup,drink)  are  not  rules,  but  can  be 
expressed  as  relations.  The  concept  of  n-ary 
relation  is  general  enough  to  express  any  rule, 
it  has  the  advantage  of  a  precise  mathematical 
definition,  and  effects  of  relational  complexity 
on processing load are known (Halford et al., in press).
Relations  are  increasingly  being  utilised  as  the 
basis  for  models  of  higher  cognitive  processes. 
In addition to analogy, the importance of
relational processing has been recognised in
similarity (Markman & Gentner, 1993), induction
(Lassaline, 1996), and categorisation (Medin,
1989). Mental model theory, which can now
account for a wide range of phenomena in
human reasoning (Gentner & Stevens, 1983;
Halford, 1993; Johnson-Laird, 1983; Polk &
Newell, 1995), is based on representation of
relations between entities. Phillips, Halford, and
Wilson  (1995)  have  argued  that  the  associative- 
relational  distinction  can  subsume  the  implic¬ 
it-explicit  distinction  of  Clark  and  Karmiloff- 
Smith  (1993).  Propositions,  which  are  the  core 
of  some  models  of  higher  cognitive  processes, 
can be treated as relational instances (Halford,
Wilson, & Phillips, in press, section 2.2.2). For
example, the proposition loves(Joe, Jenny) is a
relational  instance. 

Another  big  advantage  of  relations  is  that  they 
can  be  compared  directly  with  associations.  The 
importance  of  this  is  that  association  has  been 
accepted  as  a  fundamental  process  in  psychology 
virtually  throughout  the  history  of  the  discipline, 
and  even  many  contemporary  models  incorporate 
it in one form or another. Therefore it is a
disadvantage for associative and cognitive models to
exist in conceptual worlds that do not
communicate. We will suggest that basic processes, such
as association, and higher cognitive processes, which
we  identify  with  relations,  can  be  incorporated 
into  an  overarching  theory  that  integrates  psycho¬ 
logical  processes  at  all  levels.  First  however  we 
will  consider  the  properties  of  associative  and  re¬ 
lational  knowledge  in  more  detail. 

ASSOCIATION 

By  contrast  with  higher  cognition,  associ¬ 
ation  is  not  seen  as  inherently  structural  (Fodor 
&  Pylyshyn,  1988;  Humphrey,  1951).  It  dif¬ 
fers  from  relational  knowledge  in  a  number 
of  critical  ways,  one  of  which  is  that  it  is  not 
symbolic.  To  illustrate,  let  us  consider  two 
commonplace  relations,  between  cup  and 
drink  and  between  cup  and  saucer:  i.e. 
contains(cup, drink)  and  placed- 
on(cup, saucer).  The  relation-symbols  (or  pred¬ 
icates)  contains  and placed-on  specify  the  type 
of  link  represented,  containment  or  superpo¬ 
sition.  Contrast  this  with  associations;  cup  is 
associated  with  drink,  and  cup  is  associated 
with  saucer.  The  associations  per  se  do  not 
specify  the  relations  between  cup  and  drink, 
or  between  cup  and  saucer,  nor  do  these  asso¬ 
ciations  per  se  capture  the  fact  that  the  rela¬ 
tions  are  quite  different.  It  is  easy  to  overlook 
this  because  we  know  that  a  cup  contains  a 
drink  and  that  a  cup  is  placed  on  a  saucer,  so 
we  tend  to  see  this  information  in  the  associa¬ 
tive  link.  The  associative  link  is  causal  but 
does  not  capture  the  structure  (Fodor  &  Pyly¬ 
shyn,  1988).  Associative  links  are  unlabelled 
and  all  of  the  same  kind,  differing  only  in 
strength  (Humphrey,  1951).  The  need  for  la¬ 
belled  links  has  been  recognised  however  in 
models  of  higher  cognitive  structures  such  as 
propositional networks, in which links
between nodes carry labels such as “agent”,
“object”, “location”. An explicit symbol for a link
therefore  appears  to  be  a  property  that  distin¬ 
guishes  relational  from  associative  processes. 
Our  aim  now  is  to  define  the  properties  of  re¬ 
lational  processes  so  that  they  capture  the  es¬ 
sence  of  higher  cognition  and  can  be  compared 
directly  with  association. 


PROPERTIES  OF  RELATIONAL 
PROCESSES 

A relation that relates n entities, or n-ary
relation, is a subset of the cartesian product of
n sets: i.e. $R(a_1, a_2, \ldots, a_n)$ is a subset of
$S_1 \times S_2 \times \cdots \times S_n$. A relation is
identified by the relation symbol, R, and the entities
by argument symbols, $a_1, a_2, \ldots, a_n$. For
example the relation “larger” identifies a specific
subset of a cartesian product, that subset in which
the first entity is always larger than the second;
i.e. $a_1 > a_2$. There must be a binding between
entities and arguments which preserves the truth of
the relation; thus contains(cup, drink) is true but
contains(drink,cup) is not.
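
The definition above can be illustrated with a minimal sketch, assuming a toy two-element domain. The set names and the helper `holds` are hypothetical and are not taken from the chapter; the point is only that a relation is a subset of a cartesian product and that argument order matters.

```python
from itertools import product

# Hypothetical toy domains (illustrative only).
first_slot = {"cup", "drink"}
second_slot = {"cup", "drink"}

# The cartesian product S1 x S2 holds every ordered pair.
cartesian = set(product(first_slot, second_slot))

# The relation "contains" is the subset of pairs for which it is true.
contains = {("cup", "drink")}

def holds(relation, *args):
    """Return True if the ordered tuple of arguments is an instance of the relation."""
    return tuple(args) in relation

print(holds(contains, "cup", "drink"))  # binding preserves truth
print(holds(contains, "drink", "cup"))  # reversed binding does not
```

Swapping the arguments changes the truth value, which is the sense in which the binding between entities and argument positions must be preserved.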

Symbolisation, or an explicit label specifying
the link, is a property of relations, but not
of  associations. 

Higher-order relations have lower-order
relations as arguments; e.g. in
cause(shout-at(Tom,John), hit(John,Tom)), cause is a
higher-order relation, with shout-at(Tom,John) and
hit(John,Tom) as arguments.

Systematicity  means  that  relations  imply 
other  relations,  and  can  be  captured  by  higher- 
order  relations;  e.g.  >(a,b)  implies  <(b,a),  can 
be  written  as  the  higher-order  relation 
implies(>(a,b),<(b,a)). 

Association does not share these properties.
Associations can be chained, so that the
output of one association is the input to
another: $E_1 \rightarrow E_2 \rightarrow E_3$. They can also
converge, so that $E_1$ and $E_2$ elicit $E_3$, or diverge, so
that $E_1$ elicits $E_2$ and $E_3$. However associations
are  not  identified  by  a  symbol,  and  the  associa¬ 
tive  link  per  se  cannot  be  an  argument  to  an¬ 
other  association.  Therefore  the  recursive,  hi¬ 
erarchical  structures  that  can  be  formed  using 
higher-order  relations  do  not  appear  to  be  pos¬ 
sible  with  associations. 

Compositionality means that the components
of the relation, symbol and arguments,
retain their identity when bound into a
structure; e.g. in larger(whale, dolphin), the
components “larger”, “whale” and “dolphin” retain
their identity when bound into the relation. This
is not inherent in association, as we will see
when we consider configural associations.

Modifiability by strategic processes, without
information input, is possible for relations,
whereas associations are modified incrementally
on the basis of experience.

Omni-directional access means that, given
all but one of the components of a relational
instance, we can access (i.e. retrieve) the remaining
component. For example, given the relational
instance mother-of(woman,child), and given
mother-of(woman,?) we can access “child”,
whereas given mother-of(?,child) we can access
“woman”, and given ?(woman,child) we can
access “mother-of”. Although backward association
may be possible, omni-directional access
does not appear to be inherent in association.
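
Omni-directional access can be sketched as a lookup in which any one slot, including the relation symbol, may be left open. This is a minimal illustration, assuming relational instances are stored as plain tuples; the function name `access` and the use of `None` for the open slot are assumptions made for the example, not part of the model described in the chapter.

```python
# Relational instances stored as (symbol, arg1, arg2).
instances = [
    ("mother-of", "woman", "child"),
    ("larger", "whale", "dolphin"),
]

def access(query, store):
    """Return the fillers of the single open (None) slot in query.

    Every other slot must match the stored instance exactly.
    """
    slot = query.index(None)
    return [
        inst[slot]
        for inst in store
        if len(inst) == len(query)
        and all(q is None or q == v for q, v in zip(query, inst))
    ]

print(access(("mother-of", "woman", None), instances))  # open second argument
print(access(("mother-of", None, "child"), instances))  # open first argument
print(access((None, "woman", "child"), instances))      # open relation symbol
```

The same stored instance answers all three queries, whereas a one-directional association would support only one of them.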

Complexity can be defined by the “arity”
or number of arguments of a relation (Halford
et al., 1994; Halford et al., in press). Each
argument corresponds to a source of variation or
dimension, so an n-ary relation is a set of
points in n-dimensional space. Dimensionality
is related to processing load. Capacity is limited
by the number of dimensions (or number of
interacting variables) that can be processed in
parallel. Data in the literature, and from our own
laboratory, indicate that quaternary relations (Rank
5) are the most complex that can be processed
in parallel by most humans. Concepts too complex
to be processed in parallel are handled by
segmentation (decomposition into smaller segments
that can be processed serially) and conceptual
chunking (recoding representations into
lower rank, but at the cost of making some
relations inaccessible). For example, velocity =
distance/time is a ternary relation, and is Rank
4, but can be recoded to Rank 2, a binding
between a variable and a constant (Halford et al.,
in press, Section 3.4.1). Difficulty can vary
because of factors other than capacity, including
declarative and procedural knowledge and
amount of iteration (e.g. constructing a 5-term
series from premises a>b, b>c, c>d, d>e
requires the integration process to be iterated 3
times; a>b, b>c yields a,b,c, then this is
integrated with c>d to yield a,b,c,d, etc.).
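
The iteration in the 5-term series example can be made concrete with a short sketch. It assumes premises arrive in order and that each integration step attaches one new term to the end of the series built so far; the function `integrate` is a hypothetical helper written for this illustration, not the authors' procedure.

```python
def integrate(series, premise):
    """Extend an ordered series with a premise (x, y), read as x > y.

    Only the simple case where x is the current last term is handled.
    """
    larger, smaller = premise
    if series[-1] != larger:
        raise ValueError("premise does not extend the end of the series")
    return series + [smaller]

premises = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]

series = list(premises[0])  # the first premise supplies a, b
iterations = 0
for premise in premises[1:]:
    series = integrate(series, premise)  # each step relates three terms
    iterations += 1

print(series, iterations)
```

Each call integrates two ordered pairs into one ordered triple at most, so the load per step stays constant while the number of steps grows with the length of the series.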


In  the  next  section  we  argue  that  each  lev¬ 
el  of  cognitive  functioning  can  be  assigned 
to  an  equivalence  class  of  equal  structural 
complexity,  and  that  the  classes  can  be  or¬ 
dered  according  to  their  complexity.  They  are 
ordered  according  to  representational  rank, 
defined  as  the  number  of  components  in  cog¬ 
nitive  representations,  given  that  the  compo¬ 
nents  retain  their  identity  when  bound  into 
more  complex  representations.  An  important 
feature  of  this  idea  is  that  the  ranks  corre¬ 
spond  across  the  three  domains  of  psycho¬ 
logical  process,  neural  net  structure,  and 
empirical  observation.  Each  rank  corresponds 
to  a  class  of  neural  net  architectures  and  can 
be  identified  by  specific  empirical  criteria. 
It  is  an  extension  of  a  theory  that  defines  pro¬ 
cessing  capacity  in  terms  of  relational  com¬ 
plexity  (Halford  et  al.,  in  press). 

REPRESENTATIONAL  RANK 

Representational  rank  corresponds  to  the 
number  of  components  of  a  representation,  giv¬ 
en  that  the  components  retain  their  identity 
when  bound  in  a  more  complex  representation. 
The  metric  is  shown  in  Figure  2,  together  with 
corresponding  psychological  processes  and 
neural  net  architectures.  The  metric  combines 
relational  complexity  with  two  nonstructural 
levels,  elemental  and  configural  association, 
enabling  the  basic  properties  of  all  levels  of 
cognition  to  be  defined  within  a  single  system. 
Rank = n + 1, where n is the dimensionality or
arity of a relation. We will now give an
overview of the ranks.

Rank 0 corresponds to elemental
associations, which comprise links between
pairs of entities: $E_1 \rightarrow E_2$.

They are Rank 0 because there is no
representation other than input and output, and they
can be implemented by 2-layered nets. In
principle Rank 0 can be assessed by any associative
learning test, and because ability to perform
at this level is not in question for vertebrates,
or even for most invertebrates, no special
assessment is intended.

Cognitive process         Representational rank
elemental association     0
configural association    1
unary relation            2
binary relation           3
ternary relation          4
quaternary relation       5
quinary relation          6

Figure 2. Ranks 0-6, with schematic neural nets. Input
and output layers are omitted for Ranks 2-6.

Rank 1 corresponds to configural associations,
in which one cue is modified by another.
They have the form: $E_1, E_2 \rightarrow E_3$. An example
is conditional discrimination, shown in
Table 2. This cannot be acquired through elemental
association, because of associative interference
(each element, colour or shape, is
equally associated with each outcome). They
can be learned by fusing or “chunking” elements
into a configuration such as “black/triangle”.
This avoids associative interference but
at the cost that the components lose their identity
(e.g. “triangle” is not the same in “black/triangle”
as in “white/triangle”), so the structure
of the task is not represented. Thus the
representation is holistic and nonstructural.
Configural learning cannot be implemented by
2-layered nets (Minsky & Papert, 1969; note that
conditional discrimination is isomorphic to
exclusive-OR). They can be implemented with
three-layered nets, by using units in the hidden
layer to represent configurations of features
such as “black&triangle” (Schmajuk & DiCarlo,
1992).
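
Associative interference in conditional discrimination can be checked directly on the four items of the original task in Table 2. The sketch below is illustrative only: it tallies how often each single element co-occurs with each outcome, then shows that fused configurations remove the ambiguity. The variable names are assumptions introduced for the example.

```python
from collections import Counter

# Original task from Table 2: (colour, shape) -> outcome.
task = {
    ("black", "triangle"): "R+",
    ("black", "square"): "R-",
    ("white", "triangle"): "R-",
    ("white", "square"): "R+",
}

# Elemental tally: co-occurrence of each single element with each outcome.
elemental = Counter()
for (colour, shape), outcome in task.items():
    elemental[(colour, outcome)] += 1
    elemental[(shape, outcome)] += 1

# Every element is paired equally often with R+ and R-.
ambiguous = all(
    elemental[(element, "R+")] == elemental[(element, "R-")]
    for element in ("black", "white", "triangle", "square")
)

# Configural coding: fuse the two cues into one unit, comparable to a
# hidden unit that responds to "black&triangle".
configural = {f"{colour}/{shape}": outcome for (colour, shape), outcome in task.items()}

print(ambiguous)
print(configural["black/triangle"], configural["white/triangle"])
```

The fused keys discriminate perfectly, but “triangle” no longer exists as a separate component, which is the loss of identity described above.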

Ranks  2-6  are  structural,  and  complexity 
increases  with  rank.  We  will  consider  the  main 
properties  of  each  rank. 

Rank  2  corresponds  to  unary  relations 
which  are  a  binding  between  a  relation  symbol 
and  an  argument  symbol.  An  example  would 
be  the  proposition  happy(John).  Indicators  of 
Rank  2  include  symbolic  representation  of  cat¬ 
egories  and  understanding  word  reference. 

Rank 3 corresponds to binary relations,
which represent common states and actions in
the world, such as larger(whale,dolphin), or
loves(Joe, Jenny).

Rank 4 corresponds to ternary relations
such as “love-triangle”, which is a relation
between three people. They can be interpreted as
bivariate functions, and binary operations. For
example, the binary operation of arithmetic
addition consists of the set of ordered triples
+{..., (3,2,5), ..., (5,3,8), ...} and is a ternary
relation. Many cognitive tasks that cause
difficulty for young children, including transitivity
and class inclusion, are ternary relations
(Halford, 1993; Halford et al., in press).

Rank  5  corresponds  to  quaternary  rela¬ 
tions.  Proportion,  a/b  =  c/d,  is  a  quaternary  re¬ 
lation.  Comparison  of  moments  on  the  balance 
scale (Siegler, 1981) is another example.

Table 2. Conditional discrimination, with isomorphic transfer task.

Original task                  Transfer task
black   triangle  →  R+        green   circle  →  R+
black   square    →  R-        blue    cross   →  ?
white   triangle  →  R-        green   cross   →  ?
white   square    →  R+        blue    circle  →  ?


Rank  6  corresponds  to  quinary  relations. 
Some complex reasoning tasks, such as
categorical syllogisms and metalogical tasks,
require Rank 6.

NEURAL NET MODELING OF
REPRESENTATIONAL RANKS

Neural  nets  can  be  rank-ordered  according 
to  the  structural  complexity  of  their  internal 
representations  (excluding  input  and  output  lay¬ 
ers),  and  this  rank  ordering  corresponds  both  to 
classes  of  psychological  processes  and  to  em¬ 
pirical  criteria.  Two-layered  nets  have  no  in¬ 
ternal  representation.  Three-layered  nets  con¬ 
tain  a  representation  that  is  computed  from  the 
input.  While  allowing  that  there  are  many  vari¬ 
ations,  and  potential  for  development,  the  rep¬ 
resentation  in  a  typical  three-layered  net  is  “ho¬ 
listic”  and  is  not  structured  in  a  way  that  meets 
the  criteria  for  representation  of  relations. 
Three-layered  nets  can  represent  content-spe¬ 
cific  information  and  can  form  prototypes 
(Quinn  &  Johnson,  1997)  but  they  lack  com- 
positionality  and  systematicity  (Fodor  &  Pyly- 
shyn,  1988;  Phillips,  1994).  They  can  only 
mediate  transfer  based  on  similar  content  (Mar¬ 
cus,  submitted)  and  not  between  isomorphic 
structures  with  different  contents  (Phillips  & 
Halford,  1997). 

Nets  that  model  higher  cognitive  process¬ 
es  should  implement  the  properties  of  relation¬ 
al  processes  defined  above.  There  are  currently 
a  number  of  competing  models  that  can  meet 
these criteria, discussed by Halford et al. (in
press). In the model we will present here, each
relational  instance  is  represented  as  a  unique 


n-tuple,  by  representing  bindings  between  re¬ 
lation  symbol  and  arguments  as  outer  products. 
Thus  to  represent  loves(Joe, Jenny),  each  com¬ 
ponent,  loves,  Joe  and  Jenny  is  represented  as 
a  vector,  and  the  binding  is  represented  as  the 
outer  product  of  these  vectors.  The  outer  prod¬ 
uct  corresponds  to  the  binding  units,  shown  for 
Rank  2  in  Figure  1  but  omitted  for  simplicity  at 
higher  ranks.  Other  instances  of  loves  are  rep¬ 
resented  in  the  same  way,  and  can  be  summed 
to  form  a  tensor  product  which  represents  the 
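relation loves (Halford et al., in press, section
4.1.1.2). Thus loves(Joe, Jenny) and
loves(Tom, Wendy) are represented as:

$v_{\text{loves}} \otimes v_{\text{Joe}} \otimes v_{\text{Jenny}} + v_{\text{loves}} \otimes v_{\text{Tom}} \otimes v_{\text{Wendy}}$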

Neural  net  representations  of  relations  from 
unary  to  quinary  are  shown  schematically  in  the 
rightmost column of Figure 2. An n-ary relation
is represented by the rank n+1 tensor
$v_R \otimes v_{a_1} \otimes v_{a_2} \otimes \cdots \otimes v_{a_n}$.
A unary relation such as happy(John)
is represented by the outer product of vectors
representing “happy” and “John”:
$v_{\text{happy}} \otimes v_{\text{John}}$.

In  Figure  2  the  two  vectors  are  bound  by  a  set  of 
connections  to  a  matrix  of  binding  units.  Rank  2 
is  the  lowest  structural  level,  but  the  transition 
from  Rank  1  to  Rank  2  can  be  envisaged  by  imag¬ 
ining  the  hidden  layer  at  Rank  1  (Figure  2)  be¬ 
ing  divided  into  two  components  which  are  then 
connected  so  as  to  form  a  matrix  as  shown  for 
Rank 2. More complex relations are represented
by tensor products of higher rank. A binary
relation is represented by $v_R \otimes v_{a_1} \otimes v_{a_2}$.
There is one component representing the symbol
and one for each argument, so the representation
of an n-ary relation has n+1 components.
The components retain their identity, and the
representations have the compositionality property.
The model provides a natural explanation for
empirical observations that cognitive processing
load increases with relational complexity
(Halford et al., in press, Section 5). Representation
of a relation of rank r, with m units in each
vector, requires $m^r$ binding units. The model
implements all properties of relational knowledge
(Halford et al., in press, Section 4.2) and is more
efficient than models based on role-filler bindings
for data bases in which relational instances
are superimposed, in the sense that role-filler
bindings require r units per relational instance,
where symbol-argument bindings require 1 unit
per instance (Halford et al., in press, sections
2.2.1.2 and 4.1.3).
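
The binding scheme can be sketched in a few lines of plain Python. This is a minimal illustration, assuming one-hot (orthonormal) vectors with m = 4 units and a sparse-free dictionary representation of the tensor; the helpers `outer`, `add` and `retrieve_second` are hypothetical names introduced for the example and do not reproduce the authors' implementation.

```python
from itertools import product

def outer(*vectors):
    """Outer product, stored as a dict from index tuples to values."""
    tensor = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        value = 1.0
        for vec, i in zip(vectors, idx):
            value *= vec[i]
        tensor[idx] = value
    return tensor

def add(t1, t2):
    """Superimpose two tensors of the same shape."""
    return {idx: t1[idx] + t2[idx] for idx in t1}

# One-hot vectors (an assumption made to keep retrieval exact).
loves = [1, 0, 0, 0]
joe, jenny, tom, wendy = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]

# loves(Joe, Jenny) + loves(Tom, Wendy), a rank 3 tensor.
relation = add(outer(loves, joe, jenny), outer(loves, tom, wendy))

def retrieve_second(tensor, symbol, first):
    """Probe with the symbol and first argument to recover the second argument."""
    m = len(symbol)
    return [
        sum(tensor[(i, j, k)] * symbol[i] * first[j] for i in range(m) for j in range(m))
        for k in range(m)
    ]

print(len(relation))                          # number of binding units, m ** r
print(retrieve_second(relation, loves, joe))  # vector for Jenny
print(retrieve_second(relation, loves, tom))  # vector for Wendy
```

With m = 4 and rank r = 3 the tensor has 4 ** 3 = 64 binding units, and adding the second instance does not increase that number, which is the sense in which superimposed instances share the same units.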

ASSOCIATIONS,  RELATIONS  AND 
ANALOGY 

It follows from this analysis that higher
cognitive processes differ from associative processes
in that the former entail representation and
processing of structure. A task is cognitive to
the extent that it entails a representation and
processing of the structure of the task or situation.
The representation should have the properties
identified above. Representation of structure
(relations) is essential to analogy, and this
principle can be used to devise what is probably
the most objective and straightforward test
for cognitive processes.

The  essential  idea  is  that  if  the  structure  of 
a  task  is  learned,  it  can  be  transferred  to  iso- 
morphs  using  analogical  mapping,  and  un¬ 
known  items  in  the  new  task  can  be  predicted. 
This  principle  has  been  applied  successfully 
with  tasks  based  on  mathematical  groups  (Hal¬ 
ford,  Bain,  Maybery,  &  Andrews,  in  press)  but 
can  be  easily  illustrated  with  the  conditional 
discrimination  task  summarised  in  Table  2. 
Suppose  someone  has  learned  the  original  task. 
While  this  can  be  done  by  configural  associa¬ 
tion,  as  noted  above,  configural  discrimination 
does  not  lead  to  a  representation  of  structure 
because  the  elements  lose  their  identity.  How¬ 
ever  the  task  can  also  be  learned  by  acquiring  a 
representation  of  structure.  The  two  modes  of 


learning  can  be  distinguished  because  only  rep¬ 
resentation  of  structure  enables  transfer  to  iso- 
morphs  with  prediction  of  new  items.  Notice 
that,  in  the  transfer  task  in  Table  2,  once  the 
first  item  is  known  and  is  mapped  into  the  struc¬ 
ture,  the  remaining  three  items  can  be  easily 
predicted,  irrespective  of  order  of  presentation. 
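
The prediction step can be sketched for the tasks in Table 2. The code assumes that the learner holds the original task as a structured mapping and aligns the first known transfer item with any original item that has the same outcome; the function `predict` and its arguments are hypothetical, introduced only to make the mapping explicit.

```python
# Structure learned from the original task in Table 2.
original = {
    ("black", "triangle"): "R+",
    ("black", "square"): "R-",
    ("white", "triangle"): "R-",
    ("white", "square"): "R+",
}
base_colours = ("black", "white")
base_shapes = ("triangle", "square")

def predict(first_item, first_outcome, colours, shapes):
    """Map the known transfer item into the structure, then predict the rest."""
    # Align the known item with an original item that has the same outcome.
    base = next(item for item, out in original.items() if out == first_outcome)
    colour_map = {first_item[0]: base[0]}
    shape_map = {first_item[1]: base[1]}
    # The remaining transfer values map onto the remaining original values.
    other_colour = next(c for c in colours if c != first_item[0])
    other_shape = next(s for s in shapes if s != first_item[1])
    colour_map[other_colour] = next(c for c in base_colours if c != base[0])
    shape_map[other_shape] = next(s for s in base_shapes if s != base[1])
    return {
        (c, s): original[(colour_map[c], shape_map[s])]
        for c in colours
        for s in shapes
    }

predictions = predict(("green", "circle"), "R+", ("green", "blue"), ("circle", "cross"))
for item, outcome in predictions.items():
    print(item, outcome)
```

Either original item with outcome R+ could serve as the base, and both choices lead to the same predictions, because the task structure is symmetric under swapping both dimensions at once.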

Prediction  of  unknown  items  in  an  isomor¬ 
phic  task  in  this  way  requires  analogical  map¬ 
ping,  which  in  turn  requires  representation  of 
structure.  It  is  not  possible  if  the  task  has  been 
learned  by  configural  association.  Therefore 
transfer  between  isomorphs,  with  prediction  of 
unseen  items  is  a  clearcut  and  objective  mea¬ 
sure  of  structural  processing.  It  is  a  good  way 
to  assess  higher  cognitive  processes.  Notice  too 
that it does not impose any extraneous task
demands. The isomorphic task is assessed by the
same procedure as the original task, and structure
processing can be assessed by the number
of correct items on the first trial of a new
problem. It is not necessary to ask participants to
describe the structure or to define rules, both
of which impose an additional demand for
articulation. We have been able to use this
methodology successfully (Halford, 1980; Halford,
Bain, et al., in press; Halford & Wilson, 1980)
and have found that it was related in a systematic
way to other criteria.

CATEGORIES,  STRUCTURE  AND 
SIMILARITY 

Although natural categories can be based
on prototypes, prototypes do not represent
structure (Medin, 1989). This problem can be
overcome by forming prototypes based on relational
instances. Relational instances such as
Lives-in(chair, living room), Lives-in(vase, living
room), Lives-in(couch, living room) can be
represented as outer products of vectors and
superimposed on a tensor product. The superimposed
representation automatically averages
features of the relational instances and
corresponds to a prototype of living room furniture,
but it also incorporates structure in the form of
propositional information.

Similarity depends on more than common
features, and is influenced by structure. For
example, grey hair is rated more similar to white
hair  than  to  black  hair,  whereas  grey  clouds  are 
more  similar  to  black  clouds  than  to  white 
clouds,  because  of  our  intuitive  theories  of  age¬ 
ing  and  weather  respectively  (Medin,  1989). 
Our  model  can  handle  similarity  based  both  on 
elements  and  structure. 

Element  similarity  can  be  assessed  by  com¬ 
puting  the  dot  (inner)  products  of  vectors  rep¬ 
resenting  two  elements.  If  “desk”,  “chair”  and 
“vase”  were  coded  by  vectors  representing  sets 
of  features,  the  dot  products  of  vectors  repre¬ 
senting  “desk”  and  “chair”  would  be  higher  than 
dot  products  of  vectors  representing  “desk”  and 
“vase”,  reflecting  greater  similarity  in  the 
former  pair. 
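
A small sketch of element similarity follows. The feature coding is entirely hypothetical (the chapter does not list features), so the numbers only illustrate how shared features raise the dot product.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical binary features: [furniture, has_legs, flat_top, holds_flowers]
desk = [1, 1, 1, 0]
chair = [1, 1, 0, 0]
vase = [0, 0, 0, 1]

print(dot(desk, chair))  # shared features: furniture, has_legs
print(dot(desk, vase))   # no shared features in this coding
```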

Structural similarity can be handled by
computing dot products of tensor products.
The propositions feeds(soup-kitchen,woman)
and feeds(woman,squirrel) have low similarity
because “woman” occupies different roles.
If we represent the propositions respectively
as $v_{\text{feeds}} \otimes v_{\text{soup-kitchen}} \otimes v_{\text{woman}}$ and
$v_{\text{feeds}} \otimes v_{\text{woman}} \otimes v_{\text{squirrel}}$,
the dot products of these tensors will have
a low value (expected value is zero with
orthonormal vectors, low with sparse random
vectors). This reflects the relational context,
because woman is bound to soup-kitchen in
one case and squirrel in the other. However
cases such as feeds(man,woman) and
feeds(woman,man) are distinguished solely by
the roles occupied by entities “woman” and
“man”. We represent these in analogous fashion
as $v_{\text{feeds}} \otimes v_{\text{man}} \otimes v_{\text{woman}}$ and
$v_{\text{feeds}} \otimes v_{\text{woman}} \otimes v_{\text{man}}$.

Dot products of these tensors will again
be low, reflecting “man” and “woman” being
in different roles. This occurs because dot
products are computed so as to respect structural
alignment (the elements of $v_{\text{man}}$ are
multiplied by the elements of $v_{\text{woman}}$, and vice
versa, giving the dot product an expected value
of zero with orthonormal vectors, or a low
value with sparse random vectors). This
illustrates the sensitivity of the model to
structural alignment.
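
The role sensitivity described here can be verified with a short sketch, assuming two-unit orthonormal vectors. The helpers `outer_flat` and `dot` are illustrative names; flattening the outer product into a list is a convenience for the example rather than a claim about the model's architecture.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def outer_flat(*vectors):
    """Outer product of the vectors, flattened to one list in index order."""
    flat = [1.0]
    for vec in vectors:
        flat = [x * y for x in flat for y in vec]
    return flat

# Orthonormal vectors (an assumption; sparse random vectors would give
# small rather than exactly zero values).
feeds = [1.0, 0.0]
man = [1.0, 0.0]
woman = [0.0, 1.0]

man_feeds_woman = outer_flat(feeds, man, woman)
woman_feeds_man = outer_flat(feeds, woman, man)

print(dot(man_feeds_woman, man_feeds_woman))  # identical propositions
print(dot(man_feeds_woman, woman_feeds_man))  # same elements, swapped roles
```

The two propositions contain the same three elements, yet their similarity is zero, because the dot product multiplies the vector in each role by the vector occupying the same role in the other proposition.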


RELATIONAL  CONTEXT  SIMILARITY 

The  similarity  of  two  items  can  be  based 
on  the  degree  to  which  they  are  used  in  the  same 
relational  context.  For  example,  in  the  relational 
domain constructed around the items chair, desk
and vase detailed above, chair and desk would
achieve a high similarity as they both occur
frequently in the same relational context (i.e.
Made_of(chair,  wood)  and  Made_of(desk, 
wood),  Stands_on(chair,  floor)  and 
Stands_on(desk,  floor)).  Chair  and  vase,  how¬ 
ever  would  achieve  a  lower  similarity  as  they 
occur  less  frequently  in  the  same  relational  con¬ 
text.  Furthermore  “woman”  in  feeds(soup- 
kitchen, woman)  is  dissimilar  to  “woman”  in 
feeds(woman,squirrel)  because  the  relational 
contexts  are  different,  “soup-kitchen”  in  one 
case  and  “squirrel”  in  the  other. 

The relational context similarity of two
items, a and b, is computed as a normalised dot
product of the rank 2 tensors retrieved from
computing the dot product of each item’s vector
against an appropriate dimension of the rank
3 tensor storing the relations. This can be
applied to the hair-colour and cloud-colour
examples above. We will represent a naive theory
of ageing by propositions such as
old-people-have(hair,grey),
old-people-have(hair,white),
young-people-have(hair,black),
young-people-have(hair,brown), etc. These
propositions can be superimposed on a tensor
product representation. If we query this
representation with “grey” we retrieve
“old-people-have(hair,_)”. If we query it with
“white” we retrieve “old-people-have(hair,_)”.
The dot products of these tensors will be high,
reflecting high similarity. However if we query
the representation with “black” we retrieve
“young-people-have(hair,_)” and the dot product
of this with “old-people-have(hair,_)” is low.

By  contrast,  our  knowledge  of  weather  is 
represented by propositions
threatening(clouds,grey),
threatening(clouds,black),
nonthreatening(clouds, white) etc. Querying
with “grey” and “black” yields
threatening(clouds,_)  in  both  cases,  with  high 
dot  products  representing  high  similarity.  Que¬ 
rying  with  “white”  yields 
nonthreatening(clouds,_), which is dissimilar to
threatening(clouds,_). Thus the model represents
naive theories as sets of propositions coded
in a tensor product. Relational context, as
defined  above,  accounts  for  the  effect  of  naive 
theories  on  similarity. 
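
The hair and cloud examples can be worked through in a simplified sketch. It assumes one-hot coding, under which the rank 3 tensor reduces to counts of (symbol, argument, filler) triples and probing the third slot returns counts over (symbol, argument) pairs; the two theories are kept in separate stores. The functions `context` and `similarity` are hypothetical names for this illustration.

```python
from collections import Counter
from math import sqrt

ageing = [
    ("old-people-have", "hair", "grey"),
    ("old-people-have", "hair", "white"),
    ("young-people-have", "hair", "black"),
    ("young-people-have", "hair", "brown"),
]
weather = [
    ("threatening", "clouds", "grey"),
    ("threatening", "clouds", "black"),
    ("nonthreatening", "clouds", "white"),
]

def context(store, item):
    """Rank 2 tensor (as sparse counts) retrieved by probing the third slot."""
    return Counter((sym, arg) for sym, arg, filler in store if filler == item)

def similarity(store, a, b):
    """Normalised dot product of the contexts retrieved for a and b."""
    ca, cb = context(store, a), context(store, b)
    numerator = sum(ca[key] * cb[key] for key in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return numerator / norm if norm else 0.0

print(similarity(ageing, "grey", "white"), similarity(ageing, "grey", "black"))
print(similarity(weather, "grey", "black"), similarity(weather, "grey", "white"))
```

The same colour term, grey, ends up close to white under the ageing theory and close to black under the weather theory, which is the reversal reported by Medin (1989).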

Representational ranks are really points on
a  continuum,  and  limits  on  processing  capacity 
are  soft,  so  performance  declines  gracefully  as 
the  rank  demanded  by  a  task  increases.  It  is  pro¬ 
posed  to  model  performances  of  intermediate 
rank,  using  the  graceful  degradation  and  grace¬ 
ful  saturation  properties  of  tensor  products 
(Wilson  &  Halford,  1994). 

EMPIRICAL  INDICATORS  OF  RANKS 

Each  rank  has  a  unique  set  of  empirical  in¬ 
dicators.  We  will  consider  the  main  indicators 
for  each  rank. 

Rank  0  is  indicated  by  elemental  associa¬ 
tion.  Since  this  is  evidently  universal  to  all  an¬ 
imals  with  nervous  systems,  no  special  predic¬ 
tions  are  made. 

Rank  1  is  best  assessed  by  conditional 
discrimination.  It  is  indicated  in  general  by 
tasks  that  require  content-specific  represen¬ 
tations.  Representation  of  vanished  objects 
and  prototype  formation  both  entail  this  re¬ 
quirement,  and  are  performed  by  infants  3-6 
months  (Baillargeon,  1987).  Consequently 
the  theory  predicts  that  with  suitable  testing 
and  training  techniques,  infants  of  this  age 
can  acquire  conditional  discrimination.  The 
significance  of  this  can  be  seen  from  the  fact 
that  in  the  past  children  under  five  years  have 
had  great  difficulty  with  this  task  (Rudy, 
1991).  Two  further  predictions  follow.  The 
first  is  that  transfer  to  isomorphs  of  condi¬ 
tional  discrimination  will  not  be  possible 
until  a  median  age  of  five  years.  The  second 
is that acquisition of conditional
discrimination will be related to representation
of vanished objects as assessed by
Baillargeon (1987) and to prototype formation.

Rank 2 entails a relation-symbol that is
independent of the entity to which it is bound,
and  is  the  simplest  symbolic  representation. 
Tasks  that  require  this  level  of  structure  include: 

Explicit category membership, such as
dog(Rover), where the category label dog is
represented independently of the entity to which
it is bound, Rover. As with all relations, the
argument slot functions as a variable, and can be
instantiated in a variety of ways such as
dog(Fido), dog(Penny) etc. Representation of
explicit categories, in which there is a binding
between a category symbol and instances of the
category, seems to occur at approximately one
year (Gershkoff-Stowe, Thal, Smith, & Namy,
1997; Sugarman, 1982).

Inferences about numerosity based on category
membership (Xu & Carey, 1996).

Word  comprehension,  or  understanding  that 
words  function  as  symbols  for  their  referents. 

Representing the binding between an object and its location, as
assessed in the A-not-B task (Halford, 1993, pp. 51-56; Wellman, Cross,
& Bartsch, 1986).

Match-to-sample requires choosing an object that matches the sample
(e.g. if shown an apple as sample, required to choose between an apple
and a hammer). This task has been analysed by Premack (1983) and
Halford et al. (in press) and is an analogy based on a unary relation.
Transfer to an isomorphic task (e.g. the sample is a hammer, and the
choices are a banana and a hammer) demonstrates the principle is
recognised independently of specific content.

This  theory  appears  to  be  unique  in  pre¬ 
dicting  a  correspondence  between  all  five  tasks. 

Rank  3  entails  symbolic  processes  based 
on  binary  relations,  which  develop  at  a  me¬ 
dian  age  of  two  years  (Halford,  1993).  Tasks 
that can be used to test this level of
performance include:

Binary relational match-to-sample requires choice of a pair of objects
that has the same relation as the sample (e.g. if the sample is XX,
they should choose AA rather than BC. If the sample is XY, they should
choose BC rather than AA). This implies a form of analogical reasoning
based on binary relations, a Rank 3 representation (Gentner & Stevens,
1983; Halford et al., in press; Holyoak & Thagard, 1995).

Sorting into two categories can be assessed using the technique of
Gershkoff-Stowe et al. (1997).

Balance scale - weight and distance rules requires children to decide
whether a beam should balance, or which side will go down, based on
either weight or distance, with the other factor held constant
(Halford, 1995). This requires binary relations (Halford et al.,
in press, Section 6.3.1).

Rank  4  entails  ternary  relations.  This  lev¬ 
el  of  structure  is  required  for  transitive  infer¬ 
ence,  hierarchical  classification,  class  inclusion, 
hypothesis  testing,  cardinality  and  comprehen¬ 
sion  of  sentences  (Andrews,  1996;  Halford, 
1993;  Halford  et  al.,  in  press).  Other  tests  that 
require  this  level  of  structure  include: 

Transfer  between  isomorphs  of  conditional 
discrimination  tasks  with  prediction  of  unseen 
items.  Conditional  discrimination  has  a  well 
defined  structure  that  can  be  assessed  by  trans¬ 
fer  to  isomorphs.  As  pointed  out  above,  if  the 
relations  in  the  original  task  in  Table  1  are 
learned,  and  given  any  one  item  of  the  isomor¬ 
phic  transfer  task,  the  remaining  three  items  can 
be predicted, irrespective of order of presentation. This is a case of
analogical reasoning (Gentner & Stevens, 1983; Holyoak & Thagard, 1995)
in which the structure of the original task (the base or source) is
mapped into the transfer task (target). The structure of conditional
discrimination is basically a ternary relation, in that it consists of
ordered 3-tuples (e.g. colour, shape, response). Therefore, while
original learning can be used to infer nothing more complex than
configural association (Rank 1),
prediction  of  unseen  items  of  a  new  isomorphic 
transfer  task  reflects  processing  ternary  relations 
(Rank  4).  The  same  paradigm  can  be  used  to  as¬ 
sess  two  different  levels  of  cognitive  process, 
with procedure held constant and without additional demands such as
articulation. Infants should be able to learn the original
discrimination but should not be able to predict unseen items on the
isomorphic transfer task. Five year olds should be able to do both.
These predictions are more optimistic than previous findings that
conditional discrimination is not learned before age 5 (Gollin, 1966;
Rudy, 1991).

The tendency to prefer reversal over nonreversal shifts (Kendler,
1995). Ability to make efficient reversal shifts in multidimensional
discrimination problems was first analysed in detail by Kendler and
Kendler (1962) and there is a long history of research (see review by
Kendler, 1995, and commentary by Halford, 1997).
Reversal  shifts  depend  on  representation  of  the 
relevant  dimension,  which  requires  processing 
a  ternary  relation,  because  a  dimension  is  a  set 
on  which  an  asymmetric,  transitive  relation  is 
defined.  Representation  of  a  dimension  requires 
induction  of  a  relational  schema  (Halford,  Bain, 
et al., in press). Consequently this longstanding enigma can be
explained as a form of relational processing. Many predictions follow
from this, but the one on which we will focus here is that preference
of reversal shifts should correspond to other ternary relations tasks.

Ranks 5 and 6 entail quaternary and quinary relations respectively.
Rank 5 is typically understood at age 11 (Halford, 1993), but there
is virtually no useable data on Rank 6, though
it  is  believed  to  occur  only  in  a  minority  of 
adults.  However  we  will  consider  two  tasks  that 
appear  to  require  this  level  of  processing,  but 
have  not  been  analysed  in  this  way  before. 

RELATIONAL  PROCESSES  IN 
REASONING 

In  this  section  we  will  consider  how  rela¬ 
tional  processes  could  be  involved  in  two  rea¬ 
soning  tasks,  knights  and  knaves  and  categori¬ 
cal  syllogisms. 

Knights  and  knaves  problems  are  based  on 
the  following  scenario.  Suppose  there  is  an  is¬ 
land  where  there  are  just  two  sorts  of  inhabit¬ 
ants  -  knights  who  always  tell  the  truth  and 
knaves who always lie. An example problem is: A says “I am a knave and
B is a knave”. B says, “A is a knave”. What is the status of A and B:
Knight, knave, or impossible to tell? (Rips, 1989, pp. 85-86). The
solution entails two or more steps, but we focus on the step that
requires the highest relational complexity: If we assume A is a knight,
then A’s statement that A and B are knaves must be true, but A says A
is a knave, which is a contradiction. Therefore A must be a knave.
Symbolically:

kt(A) and says(A, (kv(A) and kv(B))) → kv(A).

Using  the  type  of  analysis  developed  by 
Halford  et  al.  (in  press-a)  there  are  five  vari¬ 
ables  in  this  expression,  corresponding  to  the 
five underlined arguments. Therefore this inference is quinary. The
second step is to reason that if it is false that A and B are knaves,
and that A is a knave, then B must be a knight:
false(kv(A) and kv(B)) and kv(A) → kt(B). This step is quaternary, so
task complexity, defined by the most complex step, is quinary.
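
The solution to the example problem can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis, that enumerates the four possible assignments and keeps those in which each speaker's statement is true exactly when the speaker is a knight. The function name and the tuple output format are illustrative assumptions; the sketch verifies the answer but says nothing about relational complexity.

```python
from itertools import product


def solve_knights_and_knaves():
    """Return every (A, B) status assignment consistent with both statements."""
    solutions = []
    for a_is_knight, b_is_knight in product([True, False], repeat=2):
        # A says: "I am a knave and B is a knave".
        a_statement = (not a_is_knight) and (not b_is_knight)
        # B says: "A is a knave".
        b_statement = not a_is_knight
        # A knight's statement must be true, a knave's must be false.
        if a_statement == a_is_knight and b_statement == b_is_knight:
            solutions.append((
                "knight" if a_is_knight else "knave",
                "knight" if b_is_knight else "knave",
            ))
    return solutions


print(solve_knights_and_knaves())
```

Only one assignment survives, with A a knave and B a knight, which agrees with the two-step argument given above.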

Categorical  syllogism  tasks  have  been  more 
extensively  investigated,  but  we  will  focus  on 
the  following  example  tasks: 

All A are B, all B are C. This would be represented by Johnson-Laird &
Byrne (1991, Table 6.1) as the mental model: [[a] b] c. This mental
model can be expressed as a relation between the following classes of
entities (where ¬A means “not A”): ABC, ¬ABC, ¬A¬BC. We can think of
this as follows: There is one class of entities with properties A, B
and C, another class with properties not-A, B and C, and another class
with properties not-A, not-B and C. The mental model that relates these
three classes has the complexity of a ternary relation. Now consider
the syllogism:

Some A are B, No B are C. The premises express a relation between the
following classes of entities: A¬BC, A¬B¬C, AB¬C, ¬AB¬C (cf. J-L&B,
1991, Table 6.1). The problem relates 5 classes of entities, so it has
the complexity of a quinary relation. J-L&B define complexity in terms
of the number of mental models required for a problem. The first
problem above requires one model and is easy (88% correct) while the
second requires 3 models and is difficult (38% correct). However more
difficult problems tend to entail more complex relations. Of the 27
syllogisms with valid conclusions, there are 7 with ternary relations
that entail 1 mental model, and 17 with relations more complex than
ternary that entail more than 1 mental model (contingency coefficient
C = .61). Therefore the relational complexity metric has potential to
provide an alternative explanation to number of mental models for
difficulty of categorical syllogisms.

CONCLUSION 

We  wish  to  propose  that  the  representation 
and  processing  of  structure,  including  analogi¬ 
cal  mapping,  are  core  processes  in  higher  cog¬ 
nition.  They  can  be  used  as  criteria  for  distin¬ 
guishing  tasks  that  demand  higher  cognitive 
processes  from  those  that  can  be  performed  by 
more basic processes. Ability to form analogies
can also be used as a criterion for neural net
models  of  higher  cognitive  processes.  The  re¬ 
lational  complexity  metric  permits  levels  of 
structure  to  be  distinguished. 

Cognitive  tasks  can  be  grouped  into 
equivalence  classes  of  equal  structural  com¬ 
plexity,  and  the  classes  can  be  ordered  ac¬ 
cording  to  representational  rank.  Ranks  0  and 
1  are  associative,  do  not  entail  explicit  rep¬ 
resentation  of  structure,  and  do  not  enable 
analogical  mappings  to  be  made.  Ranks  2-6 
entail  explicit  representation  of  relations, 
from  unary  to  quinary.  In  general  they  have 
the  properties  normally  attributed  to  higher 
cognitive  processes.  There  is  a  correspon¬ 
dence  between  three  domains:  level  of  struc¬ 
tural  complexity,  neural  net  architecture,  and 
observable properties of performance.

1. INTRODUCTION

Despite  the  growing  appreciation  of  the 
relevance  of  affect  to  cognition,  analogy  re¬ 
searchers  have  paid  remarkably  little  attention 
to  emotion.  This  paper  discusses  three  gener¬ 
al  classes  of  analogy  that  involve  emotions. 
The most straightforward are analogies and
metaphors about emotions, for example “Love
is a rose and you better not pick it.” Much more
interesting  are  analogies  that  involve  the  trans¬ 
fer  of  emotions,  for  example  in  empathy  in 
which  people  understand  the  emotions  of  oth¬ 
ers  by  imagining  their  own  emotional  reactions 
in  similar  situations.  Finally,  there  are  analo¬ 
gies  that  generate  emotions,  for  example  ana¬ 
logical  jokes  that  generate  emotions  such  as 
surprise  and  amusement. 

Understanding  emotional  analogies  re¬ 
quires  a  more  complex  theory  of  analogical  in¬ 
ference  than  has  been  currently  available,  and 
section  2  presents  a  new  account  that  shows 
how  analogical  inference  can  be  defeasible,  ho¬ 
listic,  multiple,  and  emotional,  in  ways  to  be 
described.  Analogies  about  emotions  can  to 
some  extent  be  explained  using  the  standard 
models  such  as  ACME  and  SME,  but  analo¬ 
gies  that  transfer  emotions  require  an  extended 
treatment  that  appreciates  the  special  character 
of emotional states. I describe HOTCO, a new
model  of  emotional  coherence,  that  simulates 
transfer  of  emotions.  Finally,  I  show  how  HOT¬ 
CO  models  the  generation  of  emotions  such  as 
reactions  to  humorous  analogies. 


2. ANALOGICAL INFERENCE:
CURRENT MODELS

In  logic  books,  analogical  inference  is  usu¬ 
ally  presented  by  a  schema  such  as  the  follow¬ 
ing  (Salmon,  1984,  p.  105): 

Objects of type X have properties G, H, etc.

Objects of type Y have properties G, H, etc.

Objects of type X have property F.

Therefore: Objects of type Y have property F.

For example, when experiments determined that large quantities of the
artificial sweetener saccharin caused bladder cancer in rats,
scientists analogized that it might also be carcinogenic in humans.
Logicians routinely point out that analogical arguments may be strong
or weak depending on the extent to which the properties in the premises
are relevant to the property in the conclusion.

This  characterization  of  analogical  infer¬ 
ence,  which  dates  back  at  least  to  John  Stuart 
Mill's  nineteenth-century  System  of  Logic,  is 
flawed  in  several  respects.  First,  logicians  rare¬ 
ly  spell  out  what  “relevant”  means,  so  the  sche¬ 
ma  provides  little  help  in  distinguishing  strong 
analogies  from  weak.  Second,  the  schema  is 
stated  in  terms  of  objects  and  their  properties, 
obscuring  the  fact  that  the  strongest  and  most 
useful analogies involve relations, in particular causal relations
(Gentner, 1983; Holyoak and Thagard, 1995). Such causal relations are
usually the key to determining relevance: if, in the above schema, G
and H together cause F in X, then analogically they may cause F in Y,
producing a much stronger inference than just
counting properties. Third, logicians typically
discuss  analogical  arguments  and  tend  to  ignore 
the  complexity  of  analogical  inference,  which 
requires  a  more  holistic  assessment  of  a  poten¬ 
tial  conclusion  with  respect  to  other  informa¬ 
tion.  There  is  no  point  in  inferring  that  objects 
of type Y have property F if you already know
of  many  such  objects  that  lack  F,  or  if  a  differ¬ 
ent  analogy  suggests  that  they  do  not  have  F. 
Analogical  inference  must  be  defeasible,  in  that 
the  potential  conclusion  can  be  overturned  by 
other  information,  and  it  must  be  holistic  in  that 
everything the inference maker knows is potentially
relevant to overturning or enhancing
the  inference. 

Compared  to  the  logician’s  schema,  much 
richer  accounts  of  the  structure  of  analogies 
have  been  provided  by  computational  models 
of analogical mapping such as SME (Falkenhainer,
Forbus, and Gentner, 1989) and ACME
(Holyoak  and  Thagard,  1989).  SME  uses  rela¬ 
tional  structure  to  generate  candidate  inferenc¬ 
es,  and  ACME  transfers  information  from  a 
source  analog  to  a  target  analog  using  a  pro¬ 
cess  that  Holyoak,  Novick  and  Melz  (1994) 
called  copying  with  substitution  and  generation 
(CWSG).  Similar  processes  are  used  in  case- 
based  reasoning  (Kolodner,  1993),  and  in  many 
other  computational  models  of  analogy. 

But  all  of  these  computational  models  are 
inadequate  for  understanding  analogical  infer¬ 
ence  in  general  and  emotional  analogies  in  par¬ 
ticular.  They  do  not  show  how  analogical  infer¬ 
ence  can  be  defeasible,  holistic,  and  multiple  - 
making  use  of  more  than  one  analogy  to  support 
or  defeat  a  conclusion.  Moreover,  the  prevalent 
models  of  analogy  encode  information  symbol¬ 
ically  and  assume  that  what  is  inferred  is  verbal 
information  that  can  be  represented  in  proposi¬ 
tional  form  by  predicate  calculus  or  some  simi¬ 
lar  representational  system.*  But  as  section  5 
documents,  analogical  inference  often  serves  to 
transfer  an  emotion,  not  just  the  verbal  repre¬ 
sentation  of  an  emotion.  I  will  now  describe  how 
a  new  model  of  emotional  coherence,  HOTCO, 
can  perform  analogical  inferences  that  are  de¬ 
feasible,  holistic,  multiple,  and  emotional. 


3.  ANALOGICAL  INFERENCE  IN 
HOTCO 

I  recently  proposed  a  theory  of  emotional 
coherence  that  has  applications  to  numerous 
important  psychological  phenomena  such  as 
trust  (Thagard,  forthcoming).  This  theory  makes 
the  following  assumptions  about  inference  and 
emotions: 

1) All inference is coherence-based. So-called rules of inference such
as modus ponens do not by themselves license inferences, because their
conclusions may contradict other accepted information. The only rule of
inference is: Accept a conclusion if its acceptance maximizes
coherence.

2) Coherence is a matter of constraint satisfaction, and can be
computed by connectionist and other algorithms (Thagard and Verbeurgt,
1998).

3) There are six kinds of coherence: analogical, conceptual,
explanatory, deductive, perceptual, and deliberative (Thagard,
Eliasmith, Rusnock, and Shelley, forthcoming).

4) Coherence is not just a matter of accepting or rejecting a
conclusion, but can also involve attaching a positive or negative
emotional assessment to a proposition, object, concept, or other
representation.

From  this  coherentist  perspective,  inference 
takes  on  a  very  different  complexion  from  what 
is  suggested  by  logical  deduction.  Philosophers 
who  have  advocated  coherentist  accounts  of 
inference  include  Bosanquet  (1920)  and  Har¬ 
man  (1986). 

The  computational  model  HOTCO  (for 
“hot  coherence”)  implements  these  theoretical 
assumptions. It amalgamates the following
previous models of coherence:

• Explanatory coherence: ECHO (Thagard, 1989, 1992);

• Conceptual coherence: IMP (Kunda and Thagard, 1996);

• Analogical coherence: ACME (Holyoak and Thagard, 1989);

• Deliberative coherence: DECO (Thagard and Millgram, 1995).

* One of the few attempts to deal with nonverbal analogies is the VAMP
system for visual analogical mapping: Thagard, Gochfeld, and Hardy
(1992).

Amalgamation is natural, because all of these models use a similar
connectionist algorithm for maximizing constraint satisfaction,
although  they  employ  different  constraints  op¬ 
erating  on  different  kinds  of  representation. 
What  is  novel  about  HOTCO  is  that  represen¬ 
tational  elements  possess  not  only  activations 
that  represent  their  acceptance  and  rejection, 
but  also  valences  that  represent  a  judgment  of 
their  positive  or  negative  emotional  appeal.  In 
HOTCO,  as  in  its  component  models,  inferenc¬ 
es  about  what  to  accept  are  made  by  a  holistic 
process  in  which  activation  spreads  through  a 
network  of  units  with  excitatory  and  inhibitory 
links,  representing  elements  with  positive  and 
negative  constraints.  But  HOTCO  spreads  va¬ 
lences  as  well  as  activations  in  a  similar  holis¬ 
tic  fashion,  using  the  same  system  of  excitato¬ 
ry  and  inhibitory  links.  For  example,  HOTCO 
models  the  decision  of  whether  to  hire  a  partic¬ 
ular  person  as  a  babysitter  as  in  part  a  matter  of 
“cold”  deliberative,  explanatory,  conceptual, 
and  analogical  coherence,  but  also  as  a  matter 
of  generating  an  emotional  reaction  to  the  can¬ 
didate.  The  emotional  reaction  derives  from  a 
combination  of  the  cold  inferences  made  about 
the  person  and  the  valences  attached  to  what  is 
inferred. For example, if you infer that a babysitting candidate is
responsible, intelligent, and likes children, the positive valence of
these attributes will spread to him or her; whereas if coherence leads
you to infer that the candidate
is  lazy,  dumb,  and  psychopathic,  he  or  she  will 
acquire  a  negative  valence.  In  HOTCO,  valenc¬ 
es  spread  through  the  constraint  network  in 
much  the  same  way  that  activation  does  (see 
Thagard,  forthcoming,  for  technical  details). 
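
The babysitter example can be illustrated with a small sketch of how activations and valences might settle together. This is a Python illustration under stated assumptions, not HOTCO itself: the update rule is the standard form used in connectionist constraint-satisfaction networks, the rule that a unit's valence input is the link weight times the neighbour's valence times the neighbour's positive activation is assumed here for illustration, and the unit names, weights, decay rate, and the function `settle` are all hypothetical.

```python
def settle(units, links, clamped_act, clamped_val, decay=0.05, steps=200):
    """Spread activation and valence over symmetric links until roughly stable.

    links maps (unit, unit) pairs to weights: positive weights are excitatory,
    negative weights inhibitory. Clamped units keep their given values.
    """
    act = {u: clamped_act.get(u, 0.0) for u in units}
    val = {u: clamped_val.get(u, 0.0) for u in units}
    neighbours = {u: [] for u in units}
    for (a, b), w in links.items():
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    def update(old, net):
        # Push toward +1 for positive net input and toward -1 for negative.
        if net > 0:
            new = old * (1 - decay) + net * (1.0 - old)
        else:
            new = old * (1 - decay) + net * (old + 1.0)
        return max(-1.0, min(1.0, new))

    for _ in range(steps):
        new_act, new_val = {}, {}
        for u in units:
            net_act = sum(w * act[v] for v, w in neighbours[u])
            # Assumption: valence flows only from neighbours that are accepted.
            net_val = sum(w * val[v] * max(act[v], 0.0) for v, w in neighbours[u])
            new_act[u] = clamped_act[u] if u in clamped_act else update(act[u], net_act)
            new_val[u] = clamped_val[u] if u in clamped_val else update(val[u], net_val)
        act, val = new_act, new_val
    return act, val


units = ["responsible", "likes-children", "candidate"]
links = {("responsible", "candidate"): 0.1, ("likes-children", "candidate"): 0.1}
accepted = {"responsible": 1.0, "likes-children": 1.0}
act, val = settle(units, links, clamped_act=accepted, clamped_val=accepted)
print(act["candidate"] > 0, val["candidate"] > 0)
```

With positively valenced attributes the candidate unit ends up both accepted and positively valenced. Clamping the attribute valences to -1.0 instead leaves the candidate's activation unchanged but drives its valence negative.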

Now I can describe how HOTCO performs analogical inference in a way
that is defeasible, holistic, multiple, and emotional. HOTCO uses ACME
to perform analogical mapping between a source and a target, and
copying with substitution and generation to produce new propositions to
be inferred. It can operate either in a broad mode in which everything
about the source is transferred to the target, or in a more specific
mode in which a query is used to enhance the target using a particular
proposition in the source. Here, in predicate calculus formalization
where each proposition has the structure (predicate (objects)
proposition-name), is an example of scientific inference (Shelley,
forthcoming):

Source 1: centroscymnus

(have (centroscymnus rod-pigment-1) have-1)
(absorb (rod-pigment-1 472nm-light) absorb-1)
(penetrate (472nm-light deep-ocean-water) penetrate-1)
(see-in (centroscymnus deep-ocean-water) see-in-1)
(inhabit (centroscymnus deep-ocean-water) inhabit-1)
(enable (have-1 see-in-1) enable-1)
(because (absorb-1 penetrate-1) because-1)
(adapt (see-in-1 inhabit-1) adapt-1)

Target: coelacanth-3

(have (coelacanth rod-pigment-3) have-3)
(absorb (rod-pigment-3 473nm-light) absorb-3)
(penetrate (473nm-light deep-ocean-water) penetrate-3)
(see-in (coelacanth deep-ocean-water) see-in-3)
(enable (have-3 see-in-3) enable-3)
(because (absorb-3 penetrate-3) because-3)

Operating in specific mode, HOTCO is asked what depth the coelacanth
inhabits, and uses the proposition INHABIT-1 in the source to construct
for the target the proposition

(inhabit (coelacanth deep-ocean-water) inhabit-new)

Operating  in  broad  mode  and  doing  gener¬ 
al  CWSG,  HOTCO  can  analogically  transfer 
everything  about  the  source  to  the  target,  in  this 
case  generating  the  same  proposition  as  a  can¬ 
didate  to  be  inferred. 
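
The substitution step in specific mode can be sketched in a few lines of Python. This is an illustration only: the object mapping is written out by hand here, whereas in HOTCO it would be produced by ACME, and the function name `cwsg`, the tuple encoding of propositions, and the link weight are assumptions. The "generation" part of CWSG, which creates new target objects for unmapped source objects, is simplified to copying the source object unchanged.

```python
def cwsg(source_prop, mapping, new_name):
    """Copy a source proposition into the target, substituting mapped objects.

    A proposition is encoded as (predicate, objects, proposition_name).
    Objects without a mapping are copied unchanged (a simplification).
    """
    predicate, objects, _source_name = source_prop
    new_objects = tuple(mapping.get(obj, obj) for obj in objects)
    return (predicate, new_objects, new_name)


# Hand-written correspondences; assumed here, computed by ACME in HOTCO.
mapping = {
    "centroscymnus": "coelacanth",
    "rod-pigment-1": "rod-pigment-3",
    "472nm-light": "473nm-light",
    "deep-ocean-water": "deep-ocean-water",
}

inhabit_1 = ("inhabit", ("centroscymnus", "deep-ocean-water"), "inhabit-1")
inhabit_new = cwsg(inhabit_1, mapping, "inhabit-new")

# The candidate is not asserted outright. It is tied to its source by an
# excitatory link (hypothetical weight) that acts as a positive constraint.
links = {("inhabit-1", "inhabit-new"): 0.1}

print(inhabit_new)
```

Keeping the new proposition separate from the link that supports it reflects the point that the candidate is a constraint on later coherence judgments, not a conclusion that has already been accepted.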

However, HOTCO does not actually infer the new proposition, because
analogical inference is defeasible. Rather, it simply establishes an
excitatory link between the unit representing the source proposition
INHABIT-1 and the target proposition INHABIT-NEW. This link represents
a positive constraint between the two propositions, so that coherence
maximization will encourage them to be accepted together or rejected
together. The source proposition INHABIT-1 is presumably accepted, so
in the HOTCO model it will have positive activation which will spread
to provide positive activation to INHABIT-NEW, unless INHABIT-NEW is
incompatible with other accepted propositions that will tend to
suppress its activation. Thus analogical inference is defeasible,
because all HOTCO does is to create a link representing a new
constraint for overall coherence judgment, and it is holistic, because
the entire constraint network can potentially contribute to the final
acceptance or rejection of the inferred proposition.

Within  this  framework,  it  is  easy  to  see  how 
analogical  inference  can  employ  multiple  anal¬ 
ogies,  because  more  than  one  source  can  be  used 
to  create  new  constraints.  Shelley  (forthcom¬ 
ing)  describes  how  biologists  do  not  simply  use 
the  centroscymnus  analog  as  a  source  to  infer 
that  coelacanths  inhabit  deep  water,  but  also 
use  the  following  different  source: 

Source 2: ruvettus-2

(have (ruvettus rod-pigment-2) have-2)
(absorb (rod-pigment-2 474nm-light) absorb-2)
(penetrate (474nm-light deep-ocean-water) penetrate-2)
(see-in (ruvettus deep-ocean-water) see-in-2)
(inhabit (ruvettus deep-ocean-water) inhabit-2)
(enable (have-2 see-in-2) enable-2)
(because (absorb-2 penetrate-2) because-2)
(adapt (see-in-2 inhabit-2) adapt-2)

The overall inference is that coelacanths inhabit deep water because
they are like the centroscymnus and the ruvettus sources in having rod
pigments that are an adaptation to deep water. Notice that these are
deep, systematic analogies, because the theory of natural selection
suggests that the two source fishes have the rod pigments because they
are adaptive for their deep ocean water environments. When HOTCO maps
the ruvettus source to the coelacanth target after mapping the
centroscymnus source, it creates excitatory links from the inferred
proposition INHABIT-NEW to both INHABIT-1 in the first source and
INHABIT-2 in the second source. Hence activation can flow from both
these propositions to INHABIT-NEW, so that the inference is supported
by multiple analogies. If another analog suggests a contradictory
inference, then INHABIT-NEW will be both excited and inhibited. Thus
multiple analogies can contribute to the defeasible and holistic
character of analogical inference.
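
A single-unit sketch shows how support from several sources and opposition from a conflicting proposition could combine. It is written in Python under assumptions: the neighbouring units are held at fixed activations, the update rule is the standard connectionist form with decay, and the weights, the decay rate, and the function `settle_unit` are hypothetical values chosen for illustration, not parameters taken from HOTCO.

```python
def settle_unit(inputs, decay=0.05, steps=200):
    """Settle one unit's activation given fixed (weight, activation) inputs."""
    net = sum(weight * activation for weight, activation in inputs)
    a = 0.0
    for _ in range(steps):
        if net > 0:
            a = a * (1 - decay) + net * (1.0 - a)
        else:
            a = a * (1 - decay) + net * (a + 1.0)
        a = max(-1.0, min(1.0, a))
    return a


# INHABIT-NEW supported by INHABIT-1 only.
one_source = settle_unit([(0.1, 1.0)])
# INHABIT-NEW supported by INHABIT-1 and INHABIT-2.
two_sources = settle_unit([(0.1, 1.0), (0.1, 1.0)])
# The same two sources plus an inhibitory link from an accepted,
# incompatible proposition (hypothetical weight -0.3).
with_conflict = settle_unit([(0.1, 1.0), (0.1, 1.0), (-0.3, 1.0)])

print(one_source < two_sources, with_conflict < 0)
```

Under these assumed weights, a second source raises the settled activation of INHABIT-NEW, while a sufficiently strong incompatible proposition drives it negative, so the candidate inference is rejected.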

The  new  links  created  between  the  target 
proposition  and  the  source  proposition  can  also 
make  possible  emotional  transfer.  The  coela¬ 
canth  example  is  emotionally  neutral,  but  if  an 
emotional  valence  were  attached  to  INHAB¬ 
IT-1  and  INHABIT-2,  then  the  excitatory  links 
between  them  and  INHABIT-NEW  would 
make  possible  spread  of  that  valence  as  well  as 
spread  of  activation  representing  acceptance. 
Section  5  below  provides  detailed  examples  of 
this  kind  of  emotional  analogical  inference. 

4.  ANALOGIES  ABOUT  EMOTIONS 

The  Columbia  Dictionary  of  Quotations 
(available  electronically  as  part  of  the  Microsoft 
Bookshelf)  contains  many  metaphors  and  anal¬ 
ogies  concerning  love  and  other  emotions.  For 
example, love is compared to religion, a master, a pilgrimage, an
angel/bird, gluttony, war, disease, drunkenness, insanity, market
exchange, light, ghosts, and smoke. It is not surprising that writers
discuss emotions non-literally, because it is very difficult to
describe emotions straightforwardly in words. In analogies about
emotions, verbal sources help to illuminate the emotional target,
which may be
verbally  described  but  which  also  has  an  elu¬ 
sive,  non-verbal,  phenomenological  aspect. 
Analogies  are  also  used  about  negative  emo¬ 
tions:  anger  is  like  a  volcano,  jealousy  is  a 
green-eyed  monster,  and  so  on. 

In  order  to  handle  the  complexities  of  emo¬ 
tion,  poets  often  resort  to  multiple  analogies, 
as  in  the  following  examples: 

(1)  John  Donne: 

Love  was  as  subtly  catched,  as  a  disease; 

But  being  got  it  is  a  treasure  sweet. 

(2)  Robert  Burns: 

O, my love is like a red, red rose,

That’s newly sprung in June:

My love is like a melodie,

That’s sweetly play’d in tune.

(3)  William  Shakespeare: 

Love  is  a  smoke  made  with  the  fume  of  sighs, 

Being  purged,  a  fire  sparkling  in  lovers’  eyes, 

Being vexed, a sea nourished with lovers’ tears.

What  is  it  else?  A  madness  most  discreet, 

A  choking  gall  and  a  preserving  sweet. 

In  each  of  these  examples,  the  poet  uses 
more  than  one  analogy  or  metaphor  to  bring 
out  different  aspects  of  love.  The  use  of  multi¬ 
ple  analogies  is  different  from  the  scientific 
example  described  in  the  last  section,  in  which 
the  point  of  using  two  marine  sources  was  to 
support  the  same  conclusion  about  the  depths 
inhabited  by  coelacanths.  In  these  poetic  ex¬ 
amples,  different  source  analogs  bring  out  dif¬ 
ferent  aspects  of  the  target  emotion,  love. 

Analogies  about  emotions  may  be  general, 
as  in  the  above  examples  about  love,  or  partic¬ 
ular,  used  to  describe  the  emotional  state  of  an 
individual. For example, in the movie Marvin’s Room, the character
played by Meryl Streep describes her reluctance to discuss her emotions
by saying that her feelings are like fishhooks -
you  can’t  pick  up  just  one.  Just  as  it  is  hard  to 
verbalize  the  general  character  of  an  emotion, 
it  is  often  difficult  to  describe  verbally  one’s 
own  emotional  state.  Victims  of  post-traumat¬ 
ic  stress  disorder  frequently  use  analogies  and 
metaphors to describe their own situations
(Meichenbaum, 1994, pp. 112-113):

•  I  am  a  time  bomb  ticking,  ready  to  explode. 

•  I  feel  like  I  am  caught  up  in  a  tornado. 

•  I  am  a  rabbit  stuck  in  the  glare  of  head¬ 
lights  who  can’t  move. 

•  My  life  is  like  a  rerun  of  a  movie  that  won’t 
stop. 

•  I  feel  like  I’m  in  a  cave  and  can’t  get  out. 

•  Home  is  like  a  pressure  cooker. 

•  I  am  a  robot  with  no  feelings. 

In  these  particular  emotional  analogies,  the 
target  to  be  understood  is  the  emotional  state 
of  an  individual,  and  the  verbal  source  describes 
roughly  what  the  person  feels  like. 

The  purpose  of  analogies  about  emotions 
is  often  explanatory,  describing  the  nature  of  a 
general  emotion  or  a  particular  person’s  emo¬ 
tional  state.  But  analogy  can  also  be  used  to 
help  deal  with  emotions,  as  in  the  following 
anonymous  example: 

Happiness  is  like  a  butterfly. 

The  more  you  chase  it  and  chase  it  directly 
the  more  it  eludes  you,  but 
if  you  sit  quietly  and  turn  your  attention 
to  other  things 

it comes and softly sits on your shoulder.

People are also given advice on how to deal
with  negative  emotions,  being  told  for  exam¬ 
ple  to  “vent”  their  anger,  or  to  “put  a  lid  on  it.” 

In  principle,  analogies  about  emotions 
could  be  simulated  by  the  standard  models  such 
as  ACME  and  SME,  with  a  verbal  representa¬ 
tion  of  the  source  being  used  to  generate  infer¬ 
ences  about  the  emotional  target.  However, 
even  in  some  of  the  above  examples,  the  point 
of  the  analogy  is  not  just  to  transfer  verbal  in¬ 
formation,  but  also  to  transfer  an  emotional  at¬ 
titude.  When  someone  says  “I  feel  like  I  am 
caught up in a tornado,” he or she may be saying
something like “My feelings are like the
feelings you would have if you were caught in
a tornado.” To handle the transfer of emotions,
we need to go beyond verbal analogy.

5.  ANALOGIES  THAT  TRANSFER 
EMOTIONS 

As  already  mentioned,  not  all  analogies  are 
verbal:  some  involve  transfer  of  visual  repre¬ 
sentations  (Holyoak  and  Thagard,  1995).  In 
addition,  analogies  can  involve  transfer  of  emo¬ 
tions  from  a  source  to  a  target.  There  are  at  least 
three  such  kinds  of  emotional  transfer,  involved 
in  persuasion,  empathy,  and  self-explanation. 
In  persuasion,  I  may  use  an  analogy  to  convince 
you to adopt an emotional attitude. In empathy,
I try to understand your emotional reaction
to a situation by transferring to you my
emotional  reaction  to  a  similar  situation.  In  self¬ 
explanation,  I  try  to  get  you  to  understand  my 
emotion  by  comparing  my  situation  and  emo¬ 
tional  response  to  it  with  situations  and  respons¬ 
es  familiar  to  you. 

The  purpose  of  many  persuasive  analogies 
is to produce an emotional attitude, for example
when an attempt is made to convince someone
that abortion is abominable or that capital
punishment  is  highly  desirable.  If  I  want  to  get 
someone  to  adopt  positive  emotions  toward 
something,  I  can  compare  it  to  something  else 
toward  which  he  or  she  already  has  a  positive 
attitude. Conversely, I can try to produce a
negative  attitude  by  comparison  with  something 
already  viewed  negatively.  The  structure  of 
persuasive  emotional  analogies  is: 

You have an emotional appraisal of the source S.

The  target  T  is  like  S  in  relevant  respects. 

So  you  should  have  a  similar  emotional  ap¬ 
praisal  of  T. 

Of  course,  the  emotional  appraisal  could 
be  represented  verbally  by  terms  such  as  “won¬ 
derful,”  “awful,”  and  so  on,  but  for  persuasive 
purposes  it  is  much  more  effective  if  the  gut 


feeling  that  is  attached  to  something  can  be 
transferred  over  to  something  else.  For  exam¬ 
ple,  the  point  of  analogizing  using  as  sources 
such  emotionally  intense  subjects  as  the  Holo¬ 
caust  or  infanticide  is  to  transfer  negative  emo¬ 
tions  to  the  target. 

Blanchette  and  Dunbar  (1997)  thoroughly 
documented  the  use  of  persuasive  analogies  in 
a  political  context,  the  1995  referendum  in 
which  the  people  of  Quebec  voted  whether  to 
separate  from  Canada.  In  three  Montreal  news¬ 
papers,  they  found  a  total  of  234  different  anal¬ 
ogies,  drawn  from  many  diverse  source  do¬ 
mains:  politics,  sports,  business,  and  so  on. 
Many  of  these  analogies  were  emotional:  66 
were  coded  by  Blanchette  and  Dunbar  as  emo¬ 
tionally  negative,  and  75  were  judged  to  be 
emotionally  positive.  Thus  more  than  half  of 
the  analogies  used  in  the  referendum  had  an 
identifiable  emotional  dimension.  For  example, 
the  side  opposed  to  Quebec  separation  said  “It’s 
like  parents  getting  a  divorce,  and  maybe  the 
parent  you  don’t  like  getting  custody.”  Here  the 
negative  emotional  connotation  of  divorce  is 
transferred  over  to  Quebec  separation.  In  con¬ 
trast,  the  yes  side  used  positive  emotional  ana¬ 
logs  for  separation:  “A  win  from  the  YES  side 
would  be  like  a  magic  wand  for  the  economy.” 

HOTCO  can  naturally  model  the  use  of 
emotional  persuasive  analogies.  The  separation- 
divorce  analogy  can  be  represented  as  follows: 

Source: divorce

(married (spouse-1 spouse-2) married-1)
(have (spouse-1 spouse-2 child) have-1)
(divorce (spouse-1 spouse-2) divorce-1) negative valence
(get-custody (spouse-1) get-custody-1)
(not-liked (spouse-1) get-custody-1) negative valence

Target: separation

(part-of (Quebec Canada) part-of-2)
(govern (Quebec Canada people-of-Quebec) govern-2)
(separate-from (Quebec Canada) separate-from-2)
(control (Quebec people-of-Quebec) control-2)

When  HOTCO performs  a  broad  inference 
on  this  example  (TO  BE  RUN),  it  should  not 
only  perform  the  analogical  mapping  from  the 
source  to  the  target  and  complete  the  target 
using copying with substitution and generation,
but also transfer the negative valence attached
to the proposition DIVORCE-1 to
SEPARATE-FROM-2.
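
The valence transfer just described can be sketched in a few lines of Python. This is only an illustrative sketch and not the HOTCO implementation: the function name `transfer_valence`, the dictionary representation, and the numeric valence of -1.0 are assumptions made here. The DIVORCE-1 to SEPARATE-FROM-2 correspondence comes from the text, while the MARRIED-1 to PART-OF-2 pairing is an assumed mapping. HOTCO is described as a coherence model, so the real computation is presumably more graded than a direct copy.

```python
from typing import Dict, Optional


def transfer_valence(
    source_valence: Dict[str, float],
    mapping: Dict[str, str],
    target_valence: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Copy each source proposition's valence to the target proposition it maps to."""
    result = dict(target_valence or {})
    for source_prop, valence in source_valence.items():
        target_prop = mapping.get(source_prop)
        # Only mapped propositions receive a valence; existing target valences are kept.
        if target_prop is not None and target_prop not in result:
            result[target_prop] = valence
    return result


# Assumed encoding: -1.0 stands for "negative valence".
source_valence = {"divorce-1": -1.0}
# Assumed output of the mapping stage (source proposition -> target proposition).
mapping = {"married-1": "part-of-2", "divorce-1": "separate-from-2"}

target_valence = transfer_valence(source_valence, mapping)
print(target_valence)
```

Propositions without a valence in the source, such as MARRIED-1 here, leave their target counterparts unvalenced, which matches the idea that only the emotionally loaded part of the source is carried over.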

Persuasive  analogies  have  been  rampant  in 
the recent debate about whether Microsoft has
been  engaging  in  monopolistic  practices  by  in¬ 
cluding  its  World  Wide  Web  browser  in  its 
operating  system,  Windows  98.  In  response  to 
the  suggestion  that  Microsoft  also  be  required 
to  include  the  rival  browser  produced  by  its 
competitor, Netscape, Microsoft’s chairman
Bill Gates complained that this would be “like
requiring Coca-Cola to include three cans of
Pepsi in every six-pack it sells,” or like “ordering
Ford to sell autos fitted with Chrysler engines.”
These analogies are in part emotional,
since they are intended to transfer the emotional
response to coercing Coca-Cola and Ford -
assumed to be ridiculous - over to the coercion
of  Microsoft.  On  the  other  hand,  critics  of  Mi¬ 
crosoft’s  near-monopoly  on  personal  comput¬ 
er  operating  systems  have  been  comparing 
Gates  to  John  D.  Rockefeller,  whose  predatory 
Standard  Oil  monopoly  on  petroleum  products 
was  broken  up  by  the  U.S.  government  in  1911. 

Another,  more  personal,  kind  of  persua¬ 
sive  emotional  analogy  is  identification,  in 
which  you  identify  with  someone  and  then 
transfer  positive  emotional  attitudes  about 
yourself to them. According to Fenno (1978,
p. 58), members of the U.S. Congress try to
convey a sense of identification to their constituents.
The message is “You know me, and
I’m like you, so you can trust me.” The structure
of this kind of identification is:

You have a positive emotional appraisal of
yourself (source).

I (the target) am similar to you.

So you should have a positive emotional
appraisal of me.

This is a kind of persuasive analogy, but
differs from the general case in that the source
and target are the people involved.

Empathy also involves transfer of emotional
states between people; see Barnes and
Thagard (1997) for a full discussion. It differs
from persuasion in that the goal of the analogy
is to understand rather than to convince someone.
Summarizing, the basic structure is:

You are in situation T (target).

When I was in a similar situation S, I felt
emotion E (source).

So maybe you are feeling an emotion similar
to E.

As with persuasion and identification, such
analogizing could be done purely verbally, but
it is much more effective to actually feel something
like what the target person is feeling. For
example, if I want to understand the emotional
state of a new graduate student just arrived from
a foreign country, I can recall my emotional
state of anxiety and confusion when I went to
study in England. Here is a more detailed example
of empathy involving someone trying to
understand the distress of Shakespeare’s Hamlet
at losing his father by comparing it to his or
her own loss of a job (from Barnes and Thagard,
1997):

Source: you                               Target: Hamlet

fire (boss, you): s1-fire                 kill (uncle, father): t1-kill
lose (you, job): s2-lose                  lose (Hamlet, father): t2-lose
                                          marry (uncle, mother): t3-marry
cause (s1-fire, s2-lose): s3              cause (t1-kill, t2-lose): t3a
angry (you): s4-angry                     angry (Hamlet): t4-angry
depressed (you): s5-depressed             depressed (Hamlet): t5-depressed
cause (s2-lose, s4-angry): s6             cause (t2-lose, t4-angry): t6
cause (s2-lose, s5-depressed): s7         cause (t2-lose, t5-depressed): t7
indecisive (you): s8-indecisive
cause (s5-depressed, s8-indecisive): s9


The purpose of this analogy is not simply to
draw  the  obvious  correspondences  between  the 
source  and  the  target,  but  to  transfer  over  your 
remembered  image  of  depression  to  Hamlet. 

Unlike  persuasive  analogies,  whose  main 
function  is  to  transfer  positive  or  negative  va¬ 
lence,  empathy  requires  transfer  of  the  full  range 
of  emotional  responses.  Depending  on  his  or  her 
situation, I need to imagine someone being angry,
fearful, disdainful, ecstatic, enraptured and
so  on.  As  currently  implemented,  HOTCO  trans¬ 
fers  only  positive  or  negative  valences  associat¬ 
ed  with  a  proposition  or  object,  but  it  can  easily 
be  expanded  so  that  transfer  involves  an  emo¬ 
tional  vector  which  represents  a  pattern  of  acti¬ 
vation  of  numerous  units,  each  of  whose  activa¬ 
tion  represents  different  components  of  emotion. 
This  expanded  representation  would  also  make 
possible  the  transfer  of  “mixed’’  emotions. 
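
The proposed extension from a single valence to an emotional vector can be illustrated as follows. This is a hypothetical sketch, not part of HOTCO as described: the unit names in `EMOTION_UNITS`, the `strength` parameter (standing in for how strongly the source maps onto the target), and the `is_mixed` threshold are all assumptions introduced for illustration.

```python
from typing import Dict, Tuple

# Hypothetical emotion components; the text does not specify which units would be used.
EMOTION_UNITS: Tuple[str, ...] = ("anger", "fear", "sadness", "joy")


def transfer_emotion_vector(source: Dict[str, float], strength: float = 1.0) -> Dict[str, float]:
    """Project the source's pattern of activation onto the target, scaled by `strength`."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    # Units missing from the source are treated as inactive (activation 0.0).
    return {unit: strength * source.get(unit, 0.0) for unit in EMOTION_UNITS}


def is_mixed(vector: Dict[str, float], threshold: float = 0.1) -> bool:
    """A vector counts as a 'mixed' emotion when more than one unit is active."""
    active = [unit for unit, activation in vector.items() if activation > threshold]
    return len(active) > 1


# Remembered reaction to losing a job (assumed activations), transferred to Hamlet.
job_loss = {"anger": 0.6, "sadness": 0.8}
hamlet = transfer_emotion_vector(job_loss, strength=0.5)
print(hamlet, is_mixed(hamlet))
```

Because the whole pattern is carried over rather than a single sign, anger and sadness arrive together in the target, which is the kind of mixed emotion a positive-or-negative valence cannot express.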

As  an  aside,  let  me  speculate  on  the  empathic 
origins of altruism. People are often altruistic,
caring  for  the  needs  of  others  as  well  as  for  their 
own  self-interests.  From  the  perspective  of  evo¬ 
lutionary  biology,  altruism  is  a  puzzle,  because 
natural  selection  should  favor  behaviors  that 
maximize  the  transmission  of  one’s  own  genes, 
not  those  of  others.  Kin  selection  theory  provides 
a  plausible  explanation  for  why  social  insects 
such  as  bees  sacrifice  themselves  for  their  broth¬ 
ers  and  sisters,  but  barely  begins  to  explain  hu¬ 
man  altruism,  which  often  extends  beyond  rela¬ 
tives.  I  conjecture  that  altruism  is  a  byproduct  of 
two  other  developments  favored  by  natural  se¬ 
lection:  caring  for  relatives  and  analogy.  First,  it 


is  plausible  that  genetic  transmission  is  optimized 
by  caring  for  one’s  children  and  for  those  also 
involved  in  caring  for  them.  Such  care  is  greatly 
aided  by  empathy  -  the  ability  to  understand  the 
emotional  state  of  someone  by  analogy  to  one’s 
own  emotional  state.  But  second,  analogical  in¬ 
ference  is  a  general  human  capacity,  not  fully 
found  in  apes,  but  developing  in  children  around 
the  age  of  five  (Holyoak  and  Thagard,  1995). 
Presumably,  the  ability  to  analogize  was  select¬ 
ed  for  as  part  of  general  selective  pressures  for 
increasing  intelligence,  although  it  may  be  that 
analogical  inference  is  itself  a  byproduct  of  se¬ 
lection  for  other  verbal  and  inferential  abilities. 
It  is  even  possible,  I  suppose,  that  analogical  in¬ 
ference  developed  because  it  is  socially  valuable, 
for  example  in  promoting  empathy.  In  any  case, 
assuming  that  both  empathy  for  relatives  and 
analogy  developed  biologically,  altruism  could 
have emerged as a byproduct. Our general analogical
ability enables us to empathize with
people  in  general,  not  just  our  immediate  rela¬ 
tives,  and  thereby  to  attach  value  to  the  needs  of 
others.  Like  the  abilities  to  do  mathematics,  com¬ 
pose  symphonies,  philosophize,  and  play  base¬ 
ball,  altruism  was  never  directly  selected  for,  but 
emerged  as  a  byproduct  of  other  valuable  psy¬ 
chological  capacities  -  empathy  and  analogy. 

Empathy  is  only  one  kind  of  explanatory 
emotional  analogy.  In  section  4,  we  already  saw 
examples  of  analogies  whose  function  is  self-ex¬ 
planation,  i.e.  to  explain  one’s  own  emotional  state 
to another. The following news report describes
an  astronaut’s  emotional  self-explanations: 
MOSCOW (December 2, 1997 1:53 p.m. EST Reuters) - Astronaut
David Wolf says life on the Russian Mir space station can be distinctly
unglamorous, with a load of chores that include cleaning the
toilet  and  scrubbing  fluff  from  air  filters. 

Named  NASA’s  inventor  of  the  year  in  1992,  Wolf  also  describes 
feeling  a  wide  array  of  emotions,  including  the  moment  when  the 
U.S.  space  shuttle  undocked  from  Mir,  leaving  him  behind  for  his 
four-month  mission. 

“I remember the place I last felt it. Ten years old, as my parents’
station wagon pulled away from my first summer camp in southern
Indiana. That satisfying thrill that something new is going to happen
and we don’t know what it is yet.”

Life in space can also appear dreamlike and cinematic, he said as
he related being left in charge during another space walk, when he
thought of Captain Kirk, hero of “Star Trek.”

“I  felt  like  the  kid  in  “Home  Alone”  as  I  assumed  Tolya’s  usual 
posture  at  the  central  command  post,  the  cockpit.  Or,  was  it  Kirk’s 
position?  Dream  and  reality  run  so  close  here.” 


Few  people  have  the  experience  of  being  left 
in space, but most people can remember or imagine
what it is like to leave for summer camp.
Thus emotional analogies used for self-explanation
have the function of enabling others to
have  an  empathic  understanding  of  oneself. 

Here  is  a  final  example  of  analogical  trans¬ 
fer  of  emotion:  “Psychologists  would  rather  use 
each  other’s  toothbrushes  than  each  other’s  ter¬ 
minology.”  This  is  complex,  because  at  one 
level  it  is  projecting  the  emotional  reaction  of 
disgust  from  use  of  toothbrushes  to  use  of  ter¬ 
minology,  but  it  is  also  generating  amusement. 
Let  us  now  consider  analogies  that  go  beyond 
analogical  transfer  of  emotions  and  actually 
generate  new  emotions. 

6.  ANALOGIES  THAT  GENERATE 
EMOTIONS 

A  third  class  of  emotional  analogies  in¬ 
volves  ones  that  are  not  about  emotions  and  do 
not  transfer  emotional  states,  but  rather  serve 
to  generate  new  emotional  states.  There  are  at 
least  four  subclasses  of  emotion-generating 
analogies,  involving  humor,  irony,  discovery, 
and  motivation. 


One  of  the  most  enjoyable  uses  of  analogy 
is  to  make  people  laugh,  generating  the  emo¬ 
tional  state  of  mirth  or  amusement.  The  Uni¬ 
versity  of  Michigan  recently  ran  an  informa¬ 
tional  campaign  to  get  people  to  guard  their 
computer  passwords  more  carefully.  Posters 
warn  students  to  treat  their  computer  passwords 
like underwear: make them long and mysterious,
don’t leave them lying around, and change
them  often.  The  point  of  the  analogy  is  not  to 
persuade  anyone  based  on  the  similarity  be¬ 
tween  passwords  and  underwear,  but  rather  to 
generate  amusement  that  focuses  attention  on 
the  problem  of  password  security. 

A  major  part  of  what  makes  an  analogy 
funny  is  a  surprising  combination  of  congruity 
and incongruity. Passwords do not fit semantically
with underwear, so it is surprising when a
good  relational  fit  is  presented  (change  them 
often).  Other  emotions  can  also  feed  into  mak¬ 
ing  an  analogy  funny,  for  example  when  the 
analogy  is  directed  against  a  person  or  group 
one  dislikes: 

Why  do  psychologists  prefer  lawyers  to  rats 
for  their  experiments? 

1. There are now more lawyers than rats;

2. The psychologists found they were getting
attached to the rats;

3.  And  there  are  some  things  that  rats  won’t  do. 

This  joke  depends  on  a  surprising  analogi¬ 
cal  mapping  between  rats  in  psychological  ex¬ 
periments  and  lawyers  in  their  practices,  and 
on  negative  emotions  attached  to  lawyers.  An¬ 
other  humorous  analogy  is  implicit  in  the  joke: 
“How  can  a  single  woman  get  a  cockroach  out 
of  her  kitchen?  Ask  him  for  a  commitment.” 

Some  analogical  jokes  depend  on  visual 
representations,  as  in  the  following  children’s 
joke:  “What  did  the  0  say  to  the  8?  Nice  belt.” 
This joke requires a surprising visual mapping
between numerals and human dress. A
more risqué visual example is: “Did you hear
about  the  man  with  five  penises?  His  pants 
fit  like  a  glove.”  Here  are  a  few  more  humor¬ 
ous  analogies: 

Safe  eating  is  like  safe  sex:  You  may  be 
eating  whatever  it  was  that  what  you’re 
eating  ate  before  you  ate  it. 

Changing  a  university  has  all  the  difficul¬ 
ties  of  moving  a  cemetery. 

The  juvenile  sea  squirt  wanders  through  the 
sea  searching  for  a  suitable  rock  or  hunk  of 
coral  to  cling  to  and  make  its  home  for  life. 
For  this  task,  it  has  a  rudimentary  nervous 
system.  When  it  finds  its  spot  and  takes 
root,  it  doesn’t  need  its  brain  anymore,  so 
it  eats  it!  (It’s  rather  like  getting  tenure.) 
(Dennett  1991,  p.  177) 

Bill  James  on  Tim  McCarver’s  book  on 
baseball:  “But  just  to  read  the  book  is  nearly 
impossible;  it’s  like  canoeing  across  Lake 
Molasses.” 

Red  Smith:  Telling  a  non-fan  about  base¬ 
ball  is  like  telling  an  8-year-old  about  sex. 
No  matter  what  you  say,  the  response  is 
“But  why?” 

In  all  these  cases,  there  is  an  analogical  map¬ 
ping  that  generates  surprise  and  amusement. 
In  the  emotional  coherence  theory  of 
Thagard  (forthcoming),  surprise  is  treated  as  a 
kind  of  metacoherence.  When  HOTCO  shifts 


from one coherent interpretation to another, with
units  that  were  previously  activated  being  de¬ 
activated  and  vice  versa,  the  units  that  under¬ 
went  an  activation  shift  activate  a  surprise  node. 
In  analogical  jokes,  the  unusual  mapping  pro¬ 
duces  surprise  because  it  connects  together  el¬ 
ements  not  previously  mapped,  but  does  so  in 
a  way  that  is  still  highly  coherent.  The  combi¬ 
nation  of  activation  of  the  surprise  node,  the 
coherence node, and other emotions generates
humorous  amusement. 
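
One way to picture the surprise mechanism is as a measure of how many units change state between two interpretations. The sketch below is an assumption-laden illustration, not the metacoherence computation of the emotional coherence theory: the function name `surprise_activation`, the use of a zero threshold to decide whether a unit is activated, and the unit names are all hypothetical.

```python
from typing import Dict


def surprise_activation(
    before: Dict[str, float], after: Dict[str, float], threshold: float = 0.0
) -> float:
    """Fraction of shared units that switched between activated and deactivated."""
    shared = before.keys() & after.keys()
    if not shared:
        return 0.0
    # A unit counts as shifted when it crosses the activation threshold in either direction.
    shifted = sum(
        1 for unit in shared if (before[unit] > threshold) != (after[unit] > threshold)
    )
    return shifted / len(shared)


# Hypothetical units for the password joke, before and after the punch line.
before = {"passwords-are-secret": 0.8, "passwords-like-underwear": -0.5}
after = {"passwords-are-secret": 0.8, "passwords-like-underwear": 0.7}

surprise = surprise_activation(before, after)
print(surprise)
```

In this toy case one of the two units flips from deactivated to activated, so half of the network has shifted; a full model would feed such a signal into a surprise node alongside the coherence node.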

Analogies  that  are  particularly  deep  and 
elegant  can  also  generate  an  emotion  similar  to 
that  produced  by  beauty.  A  beautiful  analogy 
is  one  so  accurate,  rich,  and  suggestive  that  it 
has the emotional appeal of an excellent scientific
theory or mathematical theorem. Holyoak
and Thagard (1995, ch. 8) describe important
scientific  analogies  such  as  the  connection  with 
Malthusian  population  growth  that  inspired 
Darwin’s  theory  of  natural  selection.  Thus  sci¬ 
entific  and  other  elegant  analogies  can  gener¬ 
ate  positive  emotions  such  as  excitement  and 
joy  without  being  funny. 

Not  all  analogies  generate  positive  emo¬ 
tions,  however.  Ironies  are  sometimes  based  on 
analogy,  and  they  are  sometimes  amusing,  but 
they  can  also  produce  negative  emotions  such 
as  despair: 

HONG KONG (January 11, 1998 AFP) -
Staff  of  Hong  Kong’s  ailing  Peregrine  In¬ 
vestments  Holdings  will  turn  up  for  work 
Monday  still  in  the  dark  over  the  fate  of 
the firm and their jobs. ...

Other  Peregrine  staff  members  at  the  bro¬ 
kerage  were  quoted  as  saying  Sunday  they 
were  pessimistic  over  the  future  of  the  firm, 
saddled  with  an  estimated  400  million  dol¬ 
lars  in  debts. 

“I’m going to see the Titanic movie... that
will be quite ironic, another big thing going
down,” the South China Morning Post
quoted one broker as saying.

Shelley  (in  progress)  argues  that  irony  is  a 
matter  of  “bicoherence,”  with  two  situations 
being perceived as both coherent and incoherent
with each other. The Peregrine Investments-Titanic
analogy is partly a matter of transferring
the emotion of despair from the Titanic
situation  to  the  company,  but  the  irony  gener¬ 
ates  an  additional  emotion  of  depressing  appro¬ 
priateness. 

The  final  category  of  emotion-generating 
analogies  I  want  to  discuss  is  motivational  ones, 
in  which  an  analogy  generates  positive  emo¬ 
tions  involved  in  inspiration  and  self-confi¬ 
dence.  Lockwood  and  Kunda  (forthcoming) 
have  described  how  people  use  role  models  as 
analogs  to  themselves,  in  order  to  suggest  new 
possibilities  for  what  they  can  accomplish.  For 
example,  an  athletic  African  American  boy 
might see Michael Jordan as someone who used
his  athletic  ability  to  achieve  great  success.  By 
analogically  comparing  himself  to  Michael  Jor¬ 
dan,  the  boy  can  feel  good  about  his  chances  to 
accomplish his athletic goals. Adopting a role
model in part involves transferring emotions,
e.g. transferring the positive valence of the role
model’s success to one’s own anticipated success,
but it also generates new emotions accompanying
the drive and inspiration to pursue the
course  of  action  that  the  analogy  suggests.  The 
general  structure  of  the  analogical  inference  is: 

My role model accomplished the goal G
by doing the action A.

I am like my role model in various respects.

So  maybe  I  can  do  A  to  accomplish  G! 

The  inference  that  I  may  have  the  abil¬ 
ity  to  do  A  can  generate  great  excitement  about 
the  prospect  of  such  an  accomplishment. 

In this paper, I have provided numerous examples
of emotional analogies, including analogies
about emotions; analogies that transfer emotions
in persuasion, empathy, and self-explanation;
and analogies that generate emotions in humor,
irony, discovery, and motivation. In order to
understand  the  cognitive  processes  involved  in 
emotional  analogies,  I  have  proposed  an  account 
of  analogical  inference  as  defeasible,  holistic, 
multiple,  and  emotional.  The  HOTCO  model  of 
emotional  coherence  provides  a  computational 
account  of  the  interaction  of  cognitive  and  emo¬ 
tional  aspects  of  analogical  inference. 
1. INTRODUCTION

According to Heraclitus, one cannot step
twice into the same river, because the river is a new
one, different from what it was before. This is why
the ancient philosophers proposed the conception
of panta rhei. It is not only psychologists
who could say that the need to seek novelty,
or the need for change, is one of the
central human desires. The external world and
the human being himself or herself are continually
changing their states into new ones. We are constantly
coping with a changing environment. Let us
consider common verbal expressions that
involve the word “new”. Dictionaries give
the meaning of “new” as follows:

- recent in origin; novel; not known before;
different; unaccustomed; fresh after any
event; not second hand (see: The University
English Dictionary edited by R. F. Patterson);

- not existing before; lately discovered or invented;
recently born (see: The Family Dictionary
edited by Collins);

- different / a whole new “ball game”: a
separate issue or matter very different from the
matter under discussion; a new situation very
different from the present one (see: English Idioms
edited by Oxford University Press, p. 66);

- be new to the game - lack of experience
in an activity, job or situation (p. 155);

-  new  blood  -  someone  new  to  an  organi¬ 
sation,  job  or  work  who  is  expected  to  bring 
new  ideas,  innovations  (p.214) 

A  human  desire  to  search  for  the  new  word 
and  to  cope  with  novelty  can  be  seen  in  such 
verbal  expressions,  which  promoted  the  new 
streams  of  history,  discoveries  and  civilisations. 
The  examples  might  be: 

New Style - a chronological term to denote
dates  reckoned  by  the  Gregorian  calendar; 
New Deal - a campaign initiated in 1933
by President Franklin Roosevelt in the USA involving
a complete overhaul of American economic
life, the development of the national resources
and the safeguarding of conditions for labour;

New Learning - the Renaissance;

New  Testament  -  later  of  the  two  main  di¬ 
visions  of  the  Bible; 

New  World  -  North  and  South  America; 

The above expressions present the notion
“new” as an unknown reality, different from what
was well known and experienced before. The new
situation is a reality which is in question because
it is at least less known. The question is how to
cope with a new reality which, on the one hand,
is expected to be reached and, on the other hand,
is risky because of its novelty and requires a decision
about which way to go and how to “possess”
cognitively the current stream of the environment.

2.  COGNITIVE  COPING  WITH  NEW 
SITUATION 

Cogito ergo sum in a new situation requires
coping cognitively with an unknown and uncertain
environment. Cognitive coping with a new
environment involves employing schemas of inference
and forecasting, not only to survive but
also to develop human potential. Generally
speaking, there are two schemas which
could be used to cope with new situations: (1)
deductive reasoning schemas based on logical
implication; and (2) analogical reasoning,
which is not founded on logical implication.
Unfortunately, deductive
inference cannot be employed in many new
situations, where general premises can hardly
be formulated. In those cases only analogical
reasoning can be employed to draw conclusions
about the situations in question.
This is why analogical inference is often
called reasoning from case to case.
In concluding our reflection on cognitive coping
with a new situation, we can say that analogy
can be recognised as the cogito which leads
coping with the unknown environment.

Analogical cogito is a cognitive “vehicle”
for searching for connections, relations and
correspondences between the new domain in
question and the well-known domain. However,
to be more precise, we should state that analogy
assumes a comparison between two domains,
situations, fields or areas with respect to
specific relations. Therefore analogy can be
interpreted as a cognitive schema, i.e. a kind of
mental principle or structural path of the human
mind, and at the same time a mental vehicle
which drives the search for relational connections
(or correspondences) within the considered
domains and between them. More formally,
we can define analogy (Anal) as a two-component
complex relation expressed as:

(1) $\mathrm{Anal} = R(D, D')$ or

(2) $\mathrm{Anal} = D \, R \, D'$, where

$D$ - a well-known domain or situation;

$D'$ - a new (less known) domain or situation,
i.e. the one in question.

The relation considered in (1) or (2) is a complex
one because its components, the domains $D$ and
$D'$, correspond to one another with respect to
their constituting relations, $R_n$ and $R'_n$ respectively:

(3) $D = R_n(x_1, x_2, \ldots, x_i, \ldots, x_n)$

(4) $D' = R'_n(x'_1, x'_2, \ldots, x'_i, \ldots, x'_n)$, where

$R_n$ - a base relation for analogy, which denotes
that the known domain $D$ corresponds to
the new domain $D'$ in such a way that the relation
$R_n$ constituting $D$ fits the relation $R'_n$
constituting $D'$;

$R'_n$ - a relation constituting the new domain
$D'$ (which is in question) and corresponding to
the base relation for analogy $R_n$ constituting the
known domain $D$;

$x_1, x_2, \ldots, x_i, \ldots, x_n$ - the components of the base
relation for analogy $R_n$;

$x'_1, x'_2, \ldots, x'_i, \ldots, x'_n$ - the components of the
relation $R'_n$ corresponding to the base relation
for analogy $R_n$.

After substituting (3) and (4) into (1) and (2), we can
formally define analogy in a more complex formula,
respectively:

(5) $\mathrm{Anal} = R[R_n(x_1, x_2, \ldots, x_i, \ldots, x_n),\ R'_n(x'_1, x'_2, \ldots, x'_i, \ldots, x'_n)]$,

(6) $\mathrm{Anal} = [R_n(x_1, x_2, \ldots, x_i, \ldots, x_n)] \, R \, [R'_n(x'_1, x'_2, \ldots, x'_i, \ldots, x'_n)]$.

Analogy is used as a scheme to cope with a
problem in a new domain or situation. This
scheme allows one to formulate two premises and
to draw a conclusion concerning the unknown
component of the base relation within the new
situation:

Premises:

1. The domain $D'$ corresponds to the domain
$D$ with respect to their constituting
relations, $R'_n$ and $R_n$ respectively.

2. The component $x'_i$ of the relation $R'_n$ is
unknown in the domain $D'$, but the others are
known and fit the corresponding components of
the relation $R_n$ in the domain $D$.

(7) Conclusion: $x'_i$ is like $x_i$.

The first premise of the scheme (7) states a
base for analogy, i.e. a correspondence between
the compared domains with respect to the appropriate
relations. The second premise says
that one component of the relation constituting
the domain in question is unknown,
while the others fit the corresponding components
of the relation constituting the known
domain. Therefore, the analogical conclusion
completes the correspondence between the relations
$R'_n$ and $R_n$, stating that the unknown
component $x'_i$ has found its corresponding
component $x_i$.
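
The inference scheme (7) can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function name `analogical_conclusion`, the use of `None` to mark the unknown component, and the solar system and atom example are introduced here and do not come from the text. The sketch also takes the "fit" between the known components (premise 2) as already established rather than checking it.

```python
from typing import Optional, Sequence, Tuple


def analogical_conclusion(
    known: Sequence[str], new: Sequence[Optional[str]]
) -> Tuple[int, str]:
    """Return (i, x_i): the single unknown component x'_i of the new relation is 'like' x_i."""
    # Premise 1: both domains are constituted by relations with the same number of components.
    if len(known) != len(new):
        raise ValueError("both relations must have the same number of components")
    # Premise 2: exactly one component of the new relation is unknown (marked as None).
    unknown = [i for i, component in enumerate(new) if component is None]
    if len(unknown) != 1:
        raise ValueError("exactly one component of the new relation must be unknown")
    i = unknown[0]
    # Conclusion (7): the unknown component corresponds to the known one at the same position.
    return i, known[i]


# Hypothetical example: D is the solar system, D' is the atom.
known_domain = ("sun", "planet", "gravitation")
new_domain = ("nucleus", None, "electrical attraction")

position, counterpart = analogical_conclusion(known_domain, new_domain)
print(position, counterpart)
```

Positions are counted from zero in the code, so the returned index 1 refers to the second component of each relation. The conclusion is only that the unknown component is like its counterpart, not that it is identical to it.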

3.  COGNITIVE  VEHICLE 

Analogy  is  used  as  a  cognitive  vehicle  in 
science,  particularly  when: 

- formulating new hypotheses;

- introducing new concepts;

- arguing for new statements.

Analogy also plays the role of a cognitive
path in the economy, politics or social endeavour
when the participants in economic, political
and social life face problems in new situations,
particularly in the transformation of
a macrosystem after the collapse of the totalitarian
system called communism.

Let us now consider more general statements
about employing analogy as a cognitive
schema for proposing solutions to
problems in the transformation situation in Central
and Eastern European countries.

Generally, one can say that D’ is a new, unknown situation when one faces problems dealing with:

- restructuring a centrally managed, state-owned economy into a private sector that better fits the realities of the free market;

- legislation, which involves parliamentary activity and then the execution of the law by the government;

- building a democratic infrastructure that enables citizens to participate in social, economic and political life.

What are the known domains D that could serve as an analogy vehicle in the search for corresponding schemas, structures, methods, legal regulations and institutions appropriate for coping with the actual problems? In the Polish transformation situation, the potential domains in which to look for economic, legislative or political analogies are situations that offer experience of a market economy, democratic institutions and legislation:

(a) the period of pre-World War II Poland, i.e. from 1918, when Poland became an independent state after the First World War, which ended the partition of Poland, until 1 September 1939, i.e. the beginning of the German occupation and then the Soviet occupation;

(b) the West European, American or Asiatic market economy institutions and solutions;

(c) the democratic and free market economy solutions known in some local communities, regions or countries which could be treated as leading or good examples of macrosystem transformation in post-communist countries.

Therefore the analogies under consideration are called pre-war Polish analogies or contemporary West European, American or Asiatic analogies. When the content, object or goal of the analogical schema is specified, we have correspondingly defined analogies: legislative, institutional, behavioural, infrastructural or organisational. If the known domain (situation) currently exists, the appropriate leading analogy can be called actual or contemporary (a local, regional, domestic or foreign analogy). If the known situation that lends the analogy can be learned only from the past, the analogy can be called historical.

Analogy as a cognitive vehicle towards new economic behaviour, a new market economy institution or a new legislative solution can have a strong background or can be supported only by a surface, superficial base. This means that the base relations for the analogy, i.e. $R_n$ and $R'_n$, may be substantial or accidental.

Biela (1993) formulated six conditions of analogical correspondence between relations whose fulfilment seems relevant to validating inference based on an analogical connection. The condition related to the substance of the base relation is expressed as the constitutiveness condition. It states that, if the domains D and D’ are defined with sufficient precision and the relations $R_n$ and $R'_n$ are as well, then, according to the available level of scientific knowledge, the existence of the domain D without the relation $R_n$ and the existence of the domain D’ without the relation $R'_n$ are impossible. The meaning of "existence" depends here on the type of domain in question and the accepted concept of the domain being considered. For example, the mode of existence of a mathematical domain depends on the assumed philosophy of mathematics. The same is true of the fine art or music domains, whose mode of existence depends on the assumed theory of the work of fine art or the musical composition. The existence of natural science domains also depends on the accepted philosophy underlying the modern theory of the particular discipline; within physics, for example, one could consider the discussion between the Duhemist and Campbellian approaches (see: Hesse, 1963).

In the social sciences "existence" depends mainly on the social perception of the relation in question. If we create the sentential functions f(D) and f(D’) from the respective nominal symbols D and D’, this condition (the condition of constitutiveness) might be expressed in the following way:

(8) $[R_n \Rightarrow f(D)] \wedge [\sim R_n \Rightarrow \sim f(D)]$ and

(9) $[R'_n \Rightarrow f(D')] \wedge [\sim R'_n \Rightarrow \sim f(D')]$.
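Read as propositional formulas, conditions (8) and (9) each join an implication with its inverse, which amounts to a biconditional. The short derivation below is a sketch under the assumption that the connectives in the printed formulas are implication, conjunction and negation, and that $R_n$ and $f(D)$ may be treated simply as propositions; the same steps apply to (9) with $R'_n$ and $f(D')$.

```latex
\begin{align*}
[R_n \Rightarrow f(D)] \wedge [\sim R_n \Rightarrow \sim f(D)]
  &\equiv [R_n \Rightarrow f(D)] \wedge [f(D) \Rightarrow R_n]
  && \text{(contraposition of the second conjunct)} \\
  &\equiv R_n \Leftrightarrow f(D)
  && \text{(definition of the biconditional)}
\end{align*}
```

On this reading, a constitutive relation holds exactly when the domain exists, which is the sense in which the domain cannot exist without it.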

The condition emphasises that not all relations recognised within the domains D and D’ can serve as the base for analogy, as many of them are not constitutive for these domains (i.e. it is still possible to conceive of the respective domain without involving many of these relations). If an analogical connection were based on surface relations that do not fulfil the condition (i.e. ones that are not constitutive for the domains D and D’), then the inference based on such a connection would not guarantee any valuable result. Moreover, such surface connections are hazardous because they create only the appearance of rational thinking by analogy.

If the condition of constitutiveness is considered in the applied areas of the social sciences, we should analyse as a criterion the social perception of distributive justice in the specific field of endeavour. More specifically, the social perception of the analysis of risks and benefits should be considered with respect to the issue in question. Therefore, as far as economic transformation is concerned, the specific questions are:

1. What are the risks and costs of the transformation?

2. Who is the beneficiary of the transformation in post-communist countries?

3. Who is taking the risk and paying the main cost of this process?

4. EXAMPLES OF ANALOGIES

Let us consider some examples of analogies employed during the macrosystem transformation in Poland.

The first group of transformational analogies consists of coping mechanisms that employ surface behavioural or institutional analogies.


Conserving  old  structures  under  a  new  coat 
of  paint 

A frequent coping strategy is to hold on to and conserve old organisations, institutions and structures while attempting to adapt them to new political and constitutional circumstances. The level of adaptation here is very superficial. This kind of adaptation could be metaphorically described as "painting over a heavily rusted car". It is the case where the old political party, central economic institutions and local municipal governments want to survive under the new political and economic circumstances. They therefore decide to make some cosmetic changes, such as a new name, a surface reorganisation, a minimal reduction in employees or a change of leaders, without any serious intention to change the deep structure of the institution or to reformulate its goal and function in the new environment.

The behavioural, institutional and organisational analogy is based on the conserved behavioural patterns and institutions learned and structured during the period of the centrally managed economy. This is a conservative analogy, which in effect avoids serious transformation.

Constructing  new  institutions  according  to 
old  patterns 

Another form of "surface" mental adaptation is constructing a new, alternative and formally independent institution according to old patterns. It sounds paradoxical, but often these old patterns were criticised by the very people who form the new institutions. The patterns stem mainly from monopolistic and totalitarian-centralist mind-sets. This happens quite often in newly established political parties, new local administrative centres, central governmental institutions, etc. The point is that people are not able to behave in a new way even when they create a new institution. Here, too, the analogy is based on behavioural and organisational patterns learned in the climate of a totalitarian mentality. This is also a conservative analogy, one that secures the continuation of that mental climate in the new political circumstances. Some critical period of time is needed to change the old patterns in people's behaviour; doing so requires changing the base relation in analogical reasoning.

Ersatz standards of freedom and high living

Another coping mechanism is to find some available evidence of freedom or of a high standard of living that can play the role of an ersatz, substituting for real freedom or a real improvement in the standard of living. Examples of such substitutes include unusual goods like Western-style clothes (even if second-hand), used Western cars (even if rusted), or Western-style sex shops and sex magazines. These substitutes create the illusion that the desired changes have already succeeded. The cognitive mechanism behind such behaviour can be called an ersatz analogy: it is based on a behavioural effect that is very superficial, easily available and immediately gratifying. This kind of analogy makes it easy to achieve an illusory sense of participation in the Western high standard of living. Facing political changes, some people prefer immediate gratification to waiting for the long-term, delayed effects of the changes. Such people prefer ersatzes, which can be obtained in a short time, to the real desired changes themselves. Ersatz analogies touch only surface, superficial relations of Western life, which cannot be taken as the reality of the market economy world. Unfortunately, they create the illusion that the desired transformation towards Western democracy and a free market economy has already succeeded.

5.  LEGISLATIVE  ANALOGY 

A good example of a historical analogy that employs the experience of pre-war Poland is the initiative to restore the Prokuratoria Generalna, the institution that controls the management of the State Treasury. Let us state the background of the Prokuratoria Generalna legislative analogy.


(a) The known situation (S).

The known situation (S) is here the pre-war period, when the institution of the Prokuratoria Generalna was introduced by legislation of the Polish Sejm in 1919. The legislation was initiated by the Parliament at the very beginning of the II Polish Republic. The architects of the Polish state believed that building a market economy required an institution of very high professional and moral authority to control how efficiently the resources of the national treasury were managed. The pre-war Polish economy achieved significant development in terms of its potential, level of investment, stock market infrastructure and macroeconomic indicators. The pre-war Polish Prokuratoria Generalna functioned efficiently and attained high professional prestige and moral authority.

(b)  The  new  situation  (S’). 

The designers of the Polish macroeconomic transformation after the collapse of communism face difficulties in building a market economy. The bigger problem, however, is how to restructure the state-owned enterprises into private companies that can cope with market reality and at the same time serve the long-term benefit of the Polish economy. Building a market economy in Poland on the basis of privatisation requires an institution to control the process from the very beginning of building the market economy in the III Polish Republic. Therefore the Polish Sejm, at the very beginning of its term, has put forward a legislative initiative for a Prokuratoria Generalna that resembles the pre-war institution of the same name. Moreover, the legislative proposal of March 1998 refers in Article I to the pre-war tradition of this kind of institution.

6.  PRIVATISATION  ANALOGIES 

The process of transformation in post-communist countries drives towards changing the ownership status of state-owned enterprises into that of private entities. The problem, however, is who should own the enterprises and which model of privatisation to choose. The Polish way of privatisation employs a variety of models based on analogies drawn from comparing West European or US private companies with Polish companies of the same branch. The proposed models deal with such forms of privatisation as:

-  capital  privatisation; 

-  Employee  Stock  Ownership  Plan  (ESOP); 

-  leasing  form. 

The above-mentioned forms of private companies are the well-known domains that inspire thinking by analogy about possible ownership transformations of the related Polish companies. It is obvious, however, that the macroeconomic, social and political environment of the Western companies is hardly similar to that of the Polish ones. Moreover, the risk of the Polish transformation lies in the fact that the changes are sudden and on a large scale, which was never the case in the Western world, where the development of private companies took years. Nevertheless, analogies are the mental bridges that lead to solving the Polish problems of privatisation.

The Western models of private ownership cannot be applied literally to the Polish situation; they can be employed only partially. Let us consider the example of the American ESOP. For some Polish companies the ESOP analogy became a direct model for privatisation. However, Polish privatisation requires a more extensive model for large-scale privatisation, in which the participants will be the Polish citizens whose insufficiently paid work was accumulated into investment capital. This is why they have a right to participate in the privatisation of the state-owned companies. This kind of privatisation is called in Poland the Program Powszechnego Uwłaszczenia, which might be translated as the Citizens Ownership Program (see: Sejm print No 400). This program gives Polish citizens a chance to participate in ownership and to play an active role in the allocation of investment capital. The program uses the instruments and institutions of the stock market. Its intention is, among other things, to concentrate local and regional capital within the Local Mutual Investment Funds. The idea of such capital institutions was drawn by analogy to (a) the Western Mutual Investment Funds and (b) the pre-war Polish local Saving Cooperatives (called the Kasy Stefczyka, after the name of their promoter).

7.  REMARKS 

Analogy is a most intriguing cognitive principle that allows one to draw conclusions, particularly in new areas, domains and situations. However, the value of an analogical conclusion depends on the relation that serves as the base for the analogy. In other words, analogical reasoning may be founded on surface, superficial base relations or on a substantial background.

Building a new economy and democratic institutions after the collapse of a totalitarian system requires not only mental adaptation to a new situation but also shaping and restructuring the situation in question. Analogical reasoning plays an important role both in mental adaptation and in shaping the new situation. However, the participants in the transformation use more or less sophisticated analogies in coping with the new political, social and economic environment.

Significant progress has been made in creating cognitive simulations that model a variety of phenomena in analogy, similarity, and retrieval [1]. To date, most models have focused on exploring the fundamental phenomena involved in matching, inference, and retrieval. While there is still much to be discovered about these areas, the time seems right for more energy to be focused on simulating the roles analogy plays in larger-scale cognitive processes: what I call large-scale analogical processing.

Psychological evidence suggests that structural alignment plays a central role in many cognitive processes [cf. 2, 3, 4, 5]. An important challenge for cognitive simulations is that they be capable of modeling the same breadth of phenomena. Exploring these issues requires moving beyond simulating isolated modules and working in toy domains to creating larger-scale simulations that model a wider range of cognitive phenomena. In addition to cognitive modeling, we believe that the state of the art in analogical processing has advanced to the point where it can be used to create fundamentally new kinds of applications.

This talk describes two examples of how we are using cognitive simulation to explore the roles of structure-mapping [6] in large-scale analogical processing. We use the Structure-Mapping Engine (SME) [7, 8, 9] to model the comparison process that underlies analogy and similarity, and MAC/FAC [10, 11] to model similarity-based retrieval. The examples are:

• A design coach for students learning engineering thermodynamics that is accessible via email [12]. Students use CyclePad, an articulate virtual laboratory [13] for engineering thermodynamics, to create designs for power plants, refrigerators, and other systems. A built-in email facility enables them to ask for help from an automatic server, including advice on improving their designs. The design coach uses MAC/FAC to retrieve cases and uses SME to create advice showing how the transformation in the case can be tailored to a student's design. By using SME and MAC/FAC and our tools, human domain experts can add cases to the library without hand-coding representations or hand-indexing them for retrieval.

• An account of mental models we are developing to help explain common sense reasoning about the physical world [14]. Two common explanations for qualitative mental models are high-resolution imagery and first-principles reasoning from general domain theories. We propose instead similarity-based qualitative simulation as a psychologically plausible mechanism for common sense prediction tasks. Similarity-based qualitative simulation uses analogical retrieval and mapping of qualitative representations to make predictions in novel situations based on previously experienced behaviors.
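
The division of labour described above, a cheap retrieval stage followed by a structural comparison, can be sketched roughly as follows. This is only an illustration under simplifying assumptions: cases are lists of predicate tuples, the first stage compares predicate counts with a dot product, and the second stage is a toy greedy matcher that enforces a one-to-one entity mapping. It is not the SME algorithm or the actual MAC/FAC implementation, and all names (mac_score, fac_score, retrieve, the sample cases) are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, ...]  # (predicate, arg1, arg2, ...)


def mac_score(probe: List[Fact], case: List[Fact]) -> int:
    """Cheap, non-structural estimate: dot product of predicate counts."""
    p = Counter(fact[0] for fact in probe)
    c = Counter(fact[0] for fact in case)
    return sum(p[k] * c[k] for k in p)


def fac_score(probe: List[Fact], case: List[Fact]) -> int:
    """Toy structural score: number of probe facts that can be aligned
    with distinct case facts under one consistent one-to-one mapping
    of entities (greedy, so only an approximation)."""
    mapping: Dict[str, str] = {}   # probe entity -> case entity
    used: Set[str] = set()         # case entities already mapped to
    matched: Set[int] = set()      # indices of case facts already aligned
    score = 0
    for pf in probe:
        for idx, cf in enumerate(case):
            if idx in matched or cf[0] != pf[0] or len(cf) != len(pf):
                continue
            trial, trial_used, ok = dict(mapping), set(used), True
            for a, b in zip(pf[1:], cf[1:]):
                if a in trial:
                    if trial[a] != b:      # conflicts with earlier mapping
                        ok = False
                        break
                elif b in trial_used:      # would break one-to-one mapping
                    ok = False
                    break
                else:
                    trial[a] = b
                    trial_used.add(b)
            if ok:
                mapping, used = trial, trial_used
                matched.add(idx)
                score += 1
                break
    return score


def retrieve(probe: List[Fact], library: Dict[str, List[Fact]], keep: int = 2) -> str:
    """Stage 1 shortlists cases by the cheap score; stage 2 picks the
    shortlisted case with the best structural score."""
    shortlist = sorted(library, key=lambda n: mac_score(probe, library[n]),
                       reverse=True)[:keep]
    return max(shortlist, key=lambda n: fac_score(probe, library[n]))


# Hypothetical cases: a causal chain is probed against a "fan" of causes
# (more surface overlap) and another chain (better structural fit).
probe = [("cause", "a", "b"), ("cause", "b", "c")]
library = {
    "fan": [("cause", "x", "y"), ("cause", "x", "z"), ("cause", "x", "w")],
    "chain": [("cause", "p", "q"), ("cause", "q", "r")],
    "other": [("colour", "p", "red")],
}
best = retrieve(probe, library)
```

In this toy library the "fan" case scores highest in the cheap stage because it contains more matching predicates, but the structural stage prefers "chain", whose facts align under a single consistent mapping.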

Similarity, metaphor and analogy are fundamental mechanisms of learning. In this research I suggest a unified framework of structural alignment between situations or domains that highlights common structure and allows further properties to be projected. This structure-mapping framework suggests notions of structural consistency, systematicity and candidate inferences that offer new insights into how comparison is used to perceive commonalities and differences, project inferences and derive new abstractions.

One advantage of this framework is that it allows us to model extended metaphors that map large-scale belief systems. In one series of studies, we tested whether extended metaphors are processed as mappings from one conceptual system to another (Gentner & Boronat, 1992; in preparation). We gave participants a series of consistent metaphoric statements from one domain (the base) to another (the target); they read these statements one at a time on a computer screen. Half the subjects were in the consistent condition and received a metaphor that remained consistent throughout the passage. The other half received a different metaphor, so that the mapping shifted at the last sentence, e.g.:

CONSISTENT MAPPING [mind as knife - mind as knife]

“...After just three hours she had lost her edge...

Her mind was too dulled with fatigue for her to think well.”

INCONSISTENT MAPPING [mind as engine - mind as knife]

“...After just three hours she had run out of steam...

Her mind was too dulled with fatigue for her to think well.”


As predicted by the domain-mapping hypothesis, participants were slower to read the last sentence when there was a shift in the underlying mapping.

However, this was only true for novel metaphoric phrases. The processing of conventional metaphoric phrases was not disturbed by the shift in mapping. This finding would be predicted by Bowdle and Gentner's (in preparation) career of metaphor hypothesis, that metaphors are initially processed as mappings, but eventually become processed as lexical word senses (see also Gentner & Wolff, 1997). The implication of this finding is that structure-mapping processes are used to understand novel metaphors, and further that these processes can serve to create new word meanings.

I suggest that alignment and mapping processes are a major force in human learning and development. Analogical mapping promotes learning and conceptual change in three ways: by inviting inferences from one situation to the other, by promoting schema abstraction across the two situations, and by prompting re-representation of one or both situations. I will present evidence from studies of children and adults to show that comparison processes are a major mechanism of spontaneous learning and a natural route towards abstract systems of understanding.

In  summary,  my  thesis  is  that  analogical 
thinking  is  fundamental  to  human  cognition. 

ABSTRACT

This paper presents several challenges to models of analogy-making, namely the need for building integrated models, the need for using dynamic and emergent representations, the need for using dynamic and emergent computation, and the need to integrate analogy-making with other cognitive processes. Some experimental data that substantiate these needs are reviewed, and the main ideas of how the AMBR model of analogy-making could meet these challenges are presented.

1. FROM THE ANATOMY TOWARDS THE PHYSIOLOGY OF ANALOGY-MAKING: THE NEED FOR INTEGRATED AND DYNAMIC MODELS

For a long time now, research on analogy has concentrated on the anatomy of analogy-making, i.e. on decomposing it into pieces (representation building, retrieval, mapping, transfer, evaluation, learning) and trying to understand how each individual piece works. A number of successful models of various subprocesses (mainly of mapping and retrieval) have been built which account for most of the psychological data and make useful predictions: SME and MAC/FAC (Gentner, 1983; Falkenhainer, Forbus, Gentner, 1986; Forbus, Gentner, Law, 1995), ACME and ARCS (Holyoak, Thagard, 1989; Thagard, Holyoak, Nelson, Gochfeld, 1990; Holyoak, Thagard, 1995), IAM (Keane, Ledgeway, Duff, 1994), etc.

The big challenge in modeling analogy-making (and human cognition in general) is to move on from the atomistic and analytical approach of Democritus (469-370 BC) towards the holistic and interactionist approach of Heraclitus (544-481 BC), i.e. to start building integrated models of the phenomenon as a whole. These models should unite contraries and account for data arising from the interaction between subprocesses, which cannot be explained by an isolated model of a subprocess. Such models are gradually emerging. Thus the CopyCat and TableTop models (Hofstadter, 1995; Mitchell, 1993; French, 1995) integrate representation building with mapping and transfer, LISA (Hummel and Holyoak, 1997) integrates access, mapping, transfer, and learning, and AMBR (Kokinov, 1988, 1994c) integrates access, mapping, and transfer.

Heraclitus took the view that “Everything flows, everything changes”, i.e. the dynamics of change is more important and informative than static objects and states. This is the next challenge to the current models: they should explain and predict not only the outcomes of the analogy-making process but also its dynamics. Unfortunately, only scarce data are available on the dynamics of the process. This means that such data will have to be gathered by using experimental paradigms extensively used in other domains, for example on-line experiments measuring reaction times, analyses of thinking-aloud protocols, etc. These methods have already been used in analogy research, but to a very limited extent (Ross and Sofka, 1986; Keane, Ledgeway, and Duff, 1994; Schunn and Dunbar, 1996).

There are already experimental data which support the existence of interaction effects between the subprocesses of analogy-making. Keane, Ledgeway, and Duff (1994) have demonstrated a very strong ordering effect, i.e. an effect of the order of presentation of the target problem elements on the response time for solving the problem. In the “singleton-first” condition subjects found the mapping twice as fast as subjects in the “singleton-last” condition. These data can be considered as evidence for the interaction between perceptual and mapping processes. It would be even more interesting to find the reverse pattern: an already established mapping facilitating the perception of certain elements.

The analysis of thinking-aloud protocols done by Ross and Sofka (1986) revealed that the retrieval of various elements of the source domain is interrelated with the mapping between the two domains, i.e. the already established mappings guide the retrieval of specific source elements. These data cannot be explained by a serial model of analogy-making in which the source is first retrieved and then the source and target are mapped. An extensive discussion of this phenomenon and its modeling in AMBR, as well as simulation data obtained with AMBR, can be found in (Petrov, Kokinov, this volume). AMBR also predicts the reverse influence: the specific order of retrieval of elements of the source domain will facilitate certain mappings. As a result of these interactions, a pattern of retrieval has been demonstrated in which one source domain initially looks more promising and is better retrieved on the basis of its greater superficial similarity. As soon as mapping starts (in parallel with the continuing retrieval of domain elements), however, the higher structural correspondence between a second source domain and the target, together with the established mappings, makes it possible for the second domain to be ultimately better retrieved and mapped. This would be impossible if retrieval and mapping were sequential, isolated and irreversible processes.

Finally, a study currently underway involves video recording of subjects solving a formatting task on a computer screen. The video protocols demonstrate a complex interaction between perceiving elements on the screen (including figure/background perception), retrieving elements from memory, mapping between these elements, and performing actions on the screen, the results of which are further perceived and mapped to expectations.

The explanation of all these data requires models which abandon serial processing and move towards parallel processing, which will allow the various subprocesses to interact dynamically with each other. AMBR is one such model; it is based on the highly parallel cognitive architecture DUAL (Kokinov, 1994a, 1994b). All processes in AMBR run in parallel and interact with each other. Moreover, as described in section 3, each of these subprocesses emerges from the collective behavior of many micro-agents and thus is also inherently parallel. Since the micro-agents take part in various subprocesses, there are no clear-cut boundaries between the various processes themselves.

Before the dynamics of computation in AMBR can be presented, the need for dynamic representations that change in the course of analogy-making is discussed in the next section.

2.  FROM  PRINTED  TEXT  TOWARDS 
MOVING  PICTURE:  THE  NEED  FOR 
DYNAMIC  AND  EMERGENT 
REPRESENTATIONS 

A  printed  text  is  a  static  representational 
object while a moving picture is a dynamic representation
which emerges from the continuously changing frames.
Moreover, this dynamic representation does not exist physically (only
the  static  frames  exist  physically),  it  exists  only 
in  our  consciousness.  Analogously,  memory 
traces  may  be  considered  either  as  physically 
existing  static  entities,  or  as  emergent  phenom¬ 
ena  which  are  constructed  in  our  consciousness. 

From  the  very  beginning  of  memory  re¬ 
search  the  view  of  memory  as  consisting  of  sta¬ 
ble  representations  has  been  under  fire.  Thus 
Bartlett  (1932)  has  shown  that  episodes  are 
grouped  into  schemas  and  their  representations 
are  systematically  shifted  or  changed  in  order 
to fit these schemas. Research on autobiographical
memory has provided evidence that people
modify their memories by dropping elements
(schematising), including new elements
(filling in), replacing elements (distorting), etc.
Loftus  (1977,  1979)  has  convincingly  demon¬ 
strated  a  number  of  interference  effects.  One 
example  involves  subjects  looking  at  a  movie 
where a blue car does not stop at the site of an
accident.  Later  on  in  a  questionnaire  a  number 
of  questions  are  asked  about  a  different  green 
car.  As  a  result,  when  asked  about  the  color  of 
the  car  which  did  not  stop,  subjects  are  quite 
confident  that  it  was  green.  In  another  study 
subjects  claim  they  have  seen  broken  glass  in  a 
car  crash  whereas  there  was  no  broken  glass  in 
the  movie  shown  to  them. 

Neisser and Harsch (1992) have demonstrated
that the so-called “flash-bulb memory”
does not exist but that descriptions constructed
by human memory are so vivid that people
strongly believe they are true. One day after the
Challenger accident they asked subjects to tell
them (and write down) how they learnt about
the accident: whether they heard it on the radio,
or saw it on TV, or learnt it on the street, in
the supermarket, from friends. They further asked
the subjects in the study what they were
doing when they learnt about the accident, what
their reactions were, etc. One year later the experimenters
asked the same subjects whether
they still remembered the accident and how they
learnt about it. People claimed they had very
vivid (“flash-bulb”) memories about every single
detail and they started to tell the experimenters
a very different story from the one they had told
before. Even after the experimenters showed
them their own writings they could not believe
that the new story they were telling the experimenters
was not true.

Figure 1. Centralized and frozen representations of
episodes in LTM.

Although  it  has  long  been  demonstrated 
that human memory is a (re)constructive device
rather than a store of stable memory traces
from  our  past,  models  of  analogy-making  tend 
to  ignore  that  fact.  Typically  these  models 
would  have  a  collection  of  representations  of 
past  episodes  (prepared  by  the  author  of  the 
model)  “stored”  in  long-term  memory  (LTM), 
one  or  more  of  which  would  be  “retrieved” 
during  the  problem  solving  process  and  would 
serve  as  a  base  (or  source)  for  analogy.  The  very 
idea  of  having  singular  centralized  and  frozen 
representations  of  base  episodes  is  at  least  ques¬ 
tionable,  but  it  underlies  most  analogy-making 
models,  and  certainly  all  case-based  reasoning 
systems  (Figure  1). 

Research  on  retrieval  in  analogy-making  has 
concentrated  on  how  people  select  the  most  ap¬ 
propriate  episode  from  the  vast  set  of  episodes 
in  LTM.  It  has  been  established  that  the  exist¬ 
ence  of  similar  objects,  properties  or  relations  in 
the two domains is the crucial factor for retrieval
(Holyoak & Koh, 1987; Ross, 1989) and that
is why remote analogies are very rare. On the
other hand, structural similarities can also facilitate
retrieval under certain circumstances, when
there is a general similarity between the domains
or story lines (Ross, 1989; Wharton, Holyoak, &
Lange, 1996). There is not much research either
on  the  dynamics  of  the  process  of  retrieving  (or 
constructing),  or  on  how  complete  the  resulting 
descriptions of the episodes are.

A recently conducted experiment was designed
as a replication of Holyoak and Koh’s
(1987)  Experiment  1.  However,  a  thinking-aloud 
method  was  used.  Subjects  discussed  the  solution 
of  the  radiation  problem  in  a  class  on  thinking 
within  an  introductory  Cognitive  Science  course. 
From  3  to  7  days  later  they  were  invited  by  differ¬ 
ent  experimenters  to  participate  in  a  problem-solv¬ 
ing  session  in  an  experimental  lab.  They  had  to 
solve  the  light  bulb  problem.  Almost  all  subjects 
(except  one  who  turned  out  not  to  have  attended 
the class discussing the tumor problem) constructed
the convergence solution and explicitly (in most
cases) or implicitly made analogies with the radiation
problem. We were interested in how complete
and  correct  their  spontaneous  descriptions  of  the 
tumor  problem  story  were.  It  turned  out  that  re¬ 
membering  the  radiation  problem  is  not  an  all-or- 
nothing  case.  Different  statements  from  the  story 
were  recollected  and  used  with  varying  frequen¬ 
cy.  Thus  the  application  of  several  X-rays  on  the 
tumor  was  explicitly  mentioned  by  75%  of  the  16 
subjects  participating  in  the  experiment,  the  state¬ 
ment  that  high  intensity  rays  will  destroy  the 
healthy  tissue  was  mentioned  by  66%  of  the  sub¬ 
jects,  while  the  statement  that  low  intensity  rays 
will  not  destroy  the  tumor  was  mentioned  only 
by  25%.  Finally,  no  one  mentioned  that  the  pa¬ 
tient  would  die  if  the  tumor  was  not  destroyed. 
All  this  demonstrates  a  partial  retrieval  of  the  base: 
which  elements  of  the  base  will  be  retrieved  de¬ 
pends  on  the  pragmatically  important  aspects  of 
the  target  problem. 

On the other hand, there were some insertions,
i.e. “recollections” of statements that were
never  made  explicit  in  the  source  domain  de¬ 
scription.  Thus  one  subject  said  that  the  doctor 
was  an  oncologist  which  was  never  explicated 
in  the  radiation  problem  description  (nor  should 
it  be  necessarily  true).  Another  subject  claimed 
that  the  tumor  had  to  be  burnt  off  by  the  rays, 
which  was  also  never  formulated  in  that  way  in 
the  problem  description. 

Finally,  there  were  borrowings  from  other 
possible  bases  in  memory:  thus  one  subject  said 
that  the  tumor  had  to  be  “operated  by  laser 
beams”  while  in  the  base  story  the  operation  was 
even  forbidden.  Such  blendings  were  very  fre¬ 
quent  between  the  base  and  the  target,  thus  7  out 


of  the  11  subjects  spontaneously  re-telling  the 
base  (the  radiation)  story  were  mistakenly  using 
laser  beams  instead  of  X-rays  to  destroy  the  tu¬ 
mor.  This  blending  is  evidently  the  result  of  the 
correspondence  established  between  the  two  el¬ 
ements  and  their  high  similarity. 

In  summary,  the  experiment  has  shown  that 
remindings  about  the  base  story  are  not  all-or- 
nothing  events  and  that  subjects  make  omissions, 
insertions,  and  blendings  with  other  episodes. 

The  representation  of  episodes  in  AMBR 
is  de-centralized,  which  means  that  separate 
elements  of  the  episode’s  description  are  rep¬ 
resented  by  separate  memory  elements  (called 
micro-agents  in  the  DUAL  cognitive  architec¬ 
ture).  Thus  the  episode  as  a  whole  is  represent¬ 
ed  by  a  coalition  of  agents,  but  there  is  no  guar¬ 
antee  that  the  whole  coalition  will  be  activated 
and  become  part  of  WM.  Depending  on  the 
weights  of  the  links  between  the  agents  the  co¬ 
alition  might  be  looser  or  tighter.  This  makes  it 
possible  to  model  the  above  mentioned  psycho¬ 
logical  effects.  Thus  very  often  only  part  of  the 
agents  in  a  coalition  are  being  activated  above 
the  Working  Memory  (WM)  threshold  and  thus 
the  corresponding  episode  is  only  partially  re¬ 
trieved.  Depending  on  the  retrieval  cues  used 
various  partial  recollections  will  be  produced. 
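
The partial-retrieval idea described above can be illustrated with a small sketch. This is not AMBR's implementation: the agent names, link weights, damping factor and threshold are invented for illustration, and activation is spread by a deliberately simple synchronous update rule.

```python
# Toy sketch of partial retrieval from a decentralized episode representation.
# Agent names, link weights, DECAY and WM_THRESHOLD are hypothetical values.

WM_THRESHOLD = 0.25  # assumed working-memory threshold
DECAY = 0.5          # assumed damping applied to incoming activation

# Weighted links between the micro-agents of one episode (the "coalition").
LINKS = {
    ("tumor", "rays"): 0.8,
    ("rays", "high_intensity_harms_tissue"): 0.6,
    ("rays", "low_intensity_ineffective"): 0.2,
    ("tumor", "patient_dies_if_untreated"): 0.1,
}


def neighbours(agent):
    """Yield (other_agent, weight) for every link touching `agent`."""
    for (a, b), w in LINKS.items():
        if a == agent:
            yield b, w
        elif b == agent:
            yield a, w


def retrieve(cues, steps=5):
    """Spread activation from the cue agents; return the agents above threshold."""
    agents = {a for pair in LINKS for a in pair}
    act = {a: (1.0 if a in cues else 0.0) for a in agents}
    for _ in range(steps):
        new = {}
        for a in agents:
            incoming = sum(w * act[n] for n, w in neighbours(a))
            source = 1.0 if a in cues else 0.0
            new[a] = min(1.0, source + DECAY * incoming)
        act = new
    return {a for a, v in act.items() if v >= WM_THRESHOLD}


print(sorted(retrieve({"rays"})))
print(sorted(retrieve({"tumor"})))
```

With these made-up weights, different cues bring different subsets of the same coalition above the threshold, and weakly linked elements are never retrieved at all.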

Blendings also happen in AMBR. Thus
agents representing aspects of several different
episodes can be concurrently activated in WM.
Mappings between elements of the target and elements
of all partially retrieved episodes can be
established in parallel and compete with each other.
Typically the support that the agents in one
coalition receive from each other is enough to
achieve a global emergent “winner” episode.
However, in some cases one or more aspects needed
for the mapping (having counterparts in the
target) are missing in the representation of an episode,
or are not retrieved in WM, but instead corresponding
elements from other episodes are retrieved.
In such a case a blending between the
episodes can happen, i.e. the target elements are
partially mapped to elements of one base and partially
to elements of another base (Figure 2).

Figure 2. Blending of two episodes (represented by two
coalitions) which are partially retrieved in WM and
partially mapped on the target coalition. (The target
coalition is also part of WM, but is depicted separately
for simplicity of the diagram.)
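
A minimal sketch of how such a blend could arise is given below. It is only an illustration under strong simplifying assumptions: elements are reduced to a role label and an activation value, competition is reduced to picking the most active candidate per role, and the episode names and numbers are hypothetical.

```python
# Toy sketch of blending between two partially retrieved bases.
# Episode names, roles and activation values are hypothetical.

# Elements currently in WM: (episode, role) -> activation.
retrieved = {
    ("base_A", "agent"): 0.9,
    ("base_A", "obstacle"): 0.7,
    ("base_B", "instrument"): 0.8,
    ("base_B", "agent"): 0.4,
    # The instrument of base_A was not retrieved into WM.
}

target_roles = ["agent", "instrument", "obstacle"]


def map_target(roles, wm):
    """Map each target role to the episode whose matching element is most active."""
    mapping = {}
    for role in roles:
        candidates = [(act, ep) for (ep, r), act in wm.items() if r == role]
        if candidates:
            mapping[role] = max(candidates)[1]
    return mapping


mapping = map_target(target_roles, retrieved)
is_blend = len(set(mapping.values())) > 1
print(mapping, is_blend)
```

Here base_A wins two of the three roles, but because its instrument is missing from WM the target's instrument is mapped to base_B, so the resulting mapping draws on both bases.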

Finally, insertions (analogous to the doctor-oncologist
case) are also possible in AMBR.
Semantic knowledge is represented in a similar
decentralized fashion, i.e. different aspects
of  a  concept  are  represented  by  different  agents. 
Suppose,  for  example,  that  there  is  a  general 
rule  saying  that  liquids  are  typically  held  in 
containers.  Suppose  now  that  an  episode  is  be¬ 
ing  retrieved  in  which  water  is  heated  by  an 
immersion  heater.  It  might  well  be  the  case  that 
the  fact  that  the  water  was  in  a  glass  was  either 
not  encoded  at  all,  or  was  not  retrieved  under 
the  current  circumstances.  At  the  same  time  the 
target  situation  involves  tea  being  heated  in  a 
pot  on  a  plate.  The  agent  representing  the  fact 
that  the  tea  is  in  the  pot  will  activate  many 
agents representing similar facts and in particular
the one representing liquids being in containers.
If during the mapping process a correspondence
is attempted between those agents,
IN(TEA1, POT1) and IN(LIQUID, CONTAINER),
then instead of building a correspondence
hypothesis between them, a new agent is
built which represents a skolemized version of
the general statement, namely IN(WATER1,
CONTAINER1), and a correspondence hypothesis
between it and IN(TEA1, POT1) will be
formed. In this way the mapping process guides
the process of extending the representation
of the old episode, thus producing a new richer
representation with inclusions, such as
IN(WATER1, CONTAINER1).
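
The skolemization step in this example can be sketched as follows. The tuple encoding of statements, the function name and the `suffix` parameter are assumptions made for illustration; they are not taken from AMBR.

```python
# Toy sketch of skolemizing a general statement during mapping.
# The (predicate, arguments) encoding and the helper name are hypothetical.


def skolemize(general, bindings, suffix="1"):
    """Instantiate a general statement such as IN(LIQUID, CONTAINER).

    Arguments found in `bindings` are replaced by the known episode instance;
    every other argument receives a freshly named instance (a Skolem constant).
    """
    predicate, args = general
    new_args = tuple(bindings.get(arg, arg + suffix) for arg in args)
    return (predicate, new_args)


general_rule = ("IN", ("LIQUID", "CONTAINER"))
target_fact = ("IN", ("TEA1", "POT1"))

# In the retrieved episode, WATER1 is already known as an instance of LIQUID;
# no container was retrieved, so one is invented.
new_fact = skolemize(general_rule, {"LIQUID": "WATER1"})

# The correspondence hypothesis now links two specific statements.
hypothesis = (target_fact, new_fact)
print(hypothesis)
```

The point of the sketch is only that the new statement is built on demand, when the mapping needs a partner for IN(TEA1, POT1), rather than being stored with the episode in advance.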

In  summary,  AMBR  dynamically  forms  the 
representation  of  old  episodes  by  selecting  only 
some  of  the  encoded  aspects  of  the  episode 
(hopefully  the  relevant  ones),  and  by  adding  new 


aspects  which  have  not  been  explicitly  encoded 
from  beforehand  -  this  is  done  either  as  skolem- 
ized  versions  of  more  general  facts,  or  by  bor¬ 
rowing  facts  from  other  episodes  (blending). 

The  specific  mechanisms  proposed  in 
AMBR  for  re-representation  of  old  episodes 
might  be  psychologically  valid  or  not,  but  the 
very fact that such dynamic re-representations
are being made by humans has been shown to
be  valid  above.  Another  important  aspect  is  that 
this  re-representation  in  AMBR  is  a  result  of  the 
interplay  of  memory  retrieval  (determining 
which  agents  will  be  brought  into  WM),  map¬ 
ping  (determining  which  agents  are  unpaired), 
and  deductive  reasoning  (skolemization)  and 
could  not  be  realised  if  they  were  not  running  in 
parallel  and  interacting  with  each  other.  Finally, 
as  I  will  discuss  in  the  next  section,  all  these  com¬ 
plicated  processes  of  re-representation  and  map¬ 
ping  are  performed  using  only  local  information, 
i.e.  each  individual  agent  decides  which  links  to 
establish,  which  new  agents  to  form,  etc. 

3. FROM CENTRALIZED PLANNING
TOWARDS FREE MARKET: THE NEED
FOR DYNAMIC AND EMERGENT
COMPUTATION

Adam  Smith  is  not  only  the  most  famous 
economist  who  introduced  the  theory  of  the 
free  market  as  a  regulator  of  the  economy  and 
was against any form of governmental control
over the market. In his book “An Inquiry
into the Nature and Causes of the Wealth of
Nations” (Smith, 1776) he also introduced the
idea of emergent phenomena in the social sciences.
He wrote about “the invisible hand by
which man is led to promote an end which was
not part of his intention”. Thus when someone
decides to start the production of certain
goods  in  an  area  where  the  rate  of  profit  is  very 
high  he/she  does  it  in  order  to  gain  this  high 
profit,  however,  since  many  will  do  the  same, 
this  will  result  in  declining  prices  and  eventu¬ 
ally  decreasing  the  rate  of  profit  in  this  area 
which  was  in  no  way  a  goal  of  the  producers, 
but  they  have  achieved  it  by  their  actions.  Von 
Hayek (1967), another famous economist, proclaimed
that finding an explanation of the
mechanisms  of  these  emergent  phenomena  is 
the  main  task  of  the  social  sciences:  “those 
unintended  patterns  and  regularities  which  we 
find  to  exist  in  human  society  and  which  it  is 
the task of social theory to explain”.

Some  human  societies  were  tempted  to  find 
a  more  direct  and  faster  way  to  achieve  a  bal¬ 
ance  in  their  economy  -  why  wait  till  the  free 
market  regulates  prices  and  production  when 
the government could calculate the desired prices
and amounts of production in every economic
area and directly postulate them? These attempts
have recently collapsed completely.
Why?  The  problem  is  that  economic  systems 
are  too  complex  to  be  directly  controlled  and 
what  seems  to  be  “the  more  efficient  direct 
way”  is  actually  a  very  rigid  way  that  cannot  be 
flexible  enough  to  reflect  dynamic  changes  in 
the  environment. 

Cognitive  scientists  are  gradually  learning 
the  same  lesson.  The  attempts  to  build  a  model 
of human cognition based on a centralized control
system are doomed to failure. No such system
could be flexible enough to adapt to all
dynamic  changes  in  the  environment  and  to 
reflect  all  possible  human  goals.  Such  a  system 
is  inherently  rigid  as  it  reflects  the  tasks  and 
circumstances  envisaged  by  its  designer.  An 
alternative  approach  has  been  proposed  by 
Marvin  Minsky  (1983)  which  is  based  exactly 
on  the  analogy  with  human  societies  and  has 
been  called  “the  society  of  mind”.  Another  al¬ 
ternative  is  the  connectionist  approach  based 
on  the  analogy  with  human  neural  networks. 


Nevertheless,  we  are  still  trying  to  build 
models of analogy-making based on the assumption
that the solution of a problem is determined
by its formulation and the knowledge
background  (including  previous  solutions  to 
other  problems)  the  subject  has.  Several  exam¬ 
ples  of  context  effects  are  presented  here  which 
demonstrate  that  analogy-making  is  not  that 
simple  and  predictable. 

Kokinov and Yoveva (1996) conducted an
experiment on problem solving where seemingly
irrelevant elements of the problem solver’s
environment were manipulated. The material
manipulated consisted of drawings accompanying
other problems which happened to be printed
on the same sheet of paper. There was no relation
between the problems and the subject did
not have to solve the second problem. However,
these seemingly irrelevant pictures proved to play
a role in the problem solving process as we obtained
different results with different drawings.
We used Clement’s spring problem:

“Two springs are made of the same steel
wire and have the same number of coils. They
differ only in the diameters of the coils. Which
spring would stretch further down if we hang
the same weights on both of them?”

The problem description was accompanied
by Figure 3.

In different experimental conditions the drawings
accompanying a second unrelated
problem on the same sheet of paper were different:
a comb, a bent comb, and a beam (Figure 4).

Figure 4. Illustrations accompanying the irrelevant
problems in the various experimental conditions.

The results obtained in these experimental
conditions differed significantly (at the 0.01
and 0.001 levels): in the control condition (no
second picture on the same sheet of paper) about
half of the subjects decided that the first spring
will stretch more and the other half ‘voted’ for
the second one, with only a few saying they
will stretch equally. In the comb condition considerably
more subjects suggested that the first
spring will stretch more, in the bent comb condition
considerably more subjects preferred the
second spring, and in the beam condition more
subjects than usual decided that both springs
will stretch equally (Figure 5).

Figure 5. Percentage of proposed answers in all the
experimental conditions.

In  a  more  recent  study  (the  thinking-aloud 
experiment  described  in  section  2)  the  subjects 
who  had  to  solve  the  lightbulb  problem  were 
divided into two groups. In the control group
there were no other problems on the sheet of
paper; in the context group the following problem
was presented on the same sheet:

“The voting results from the parliamentary
elections in a faraway country have been
depicted in the following pie-chart. Would it
be possible for the largest and the smallest parties
to form a coalition which will have more
than 2/3 of the seats?”

The results are the following: in the context
group all 7 subjects who produced the convergence
solution to the lightbulb problem used
three laser beams (7:0), while in the control
group two subjects said they would use two or
three beams and the rest said they would use
either two or several beams (2:5). The difference
is significant at the 0.01 level.
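
The text does not say which statistical test was used. As a rough check, the sketch below assumes that the two ratios are counts in a 2x2 table (7:0 in the context group, 2:5 in the control group) and applies an uncorrected Pearson chi-square test; both the reading of the counts and the choice of test are assumptions.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: context group, control group (assumed reading of 7:0 and 2:5).
# Columns: first and second count of each reported ratio.
chi2, p = chi_square_2x2(7, 0, 2, 5)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under these assumptions the statistic is about 7.78, above the critical value of 6.63 for the 0.01 level with one degree of freedom. With cell counts this small an exact test would be the more conservative choice.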

The results from both experiments demonstrate
that sometimes small changes of a seemingly
arbitrary element of the environment can
radically change the outcomes of the problem
solving process (can block it, or guide it into a
specific direction).* Such phenomena are called
“catastrophes”. It would be very difficult to
account for such effects by a model based on
centralized control because in order to do so
the centralized processor would have to process
all possible stimuli in the perceptual field
and to check whether they can be involved in
the problem solving process, which would be
inefficient and time-demanding to such an extent
as to make it impossible.

Figure 6. Illustration accompanying the context
problem.

The  AMBR  model  adopts  the  following 
approach  to  accounting  for  context  effects. 
It  assumes  that  different  micro-agents  pro¬ 
cess  different  aspects  of  the  problem  and  the 
environment.  If  it  happens  that  one  agent  pro¬ 
cessing  an  arbitrary  and  seemingly  irrelevant 
visual  stimulus  enters  an  interaction  with  a 
second  agent  processing  a  relevant  problem 
aspect,  then  the  first  agent  will  be  addition¬ 
ally  activated  and  become  more  relevant  and 
thus  involved  in  the  collective  process  of 
problem  solving  performed  by  the  society  of 
agents.  This  is  a  very  brief  and  simplified 
description of what happens in the model; a
detailed  description  would  be  based  on  the 
specific  mechanisms  of  spreading  activation, 
marker passing, link establishment and between-agent
communication which are too
complicated  to  be  outlined  in  the  limited 
space  of  this  article. 

Another  important  aspect  of  analogy¬ 
making  which  makes  it  difficult  to  predict 
whether  the  subject  will  be  able  to  spontane¬ 
ously  find  an  analogous  base  (which  we  know 
he/she  knows)  is  that  this  process  depends  on 
his/her  preliminary  internal  state  which  is 
typically  not  related  to  the  current  problem, 
but  is  related  to  recently  performed  activi¬ 
ties.  Thus  Kokinov  (1990)  demonstrated 
priming  effects  on  analogical  problem  solv¬ 
ing  (as  well  as  on  other  types  of  reasoning) 
which  have  a  very  dynamic  nature,  namely 
they  are  very  powerful  immediately  after  the 
priming  event  and  decrease  in  the  course  of 
time  and  eventually  disappear  after  a  short 
period  of  time  (in  this  particular  study  with¬ 
in  a  period  of  about  25  minutes).  These  prim¬ 
ing  effects  have  been  qualitatively  repro¬ 
duced  by  a  previous  version  of  the  AMBR 
model based on the pre-activation of certain
agents  and  the  decay  of  their  activation  in  the 
course  of  time  (Kokinov,  1994c).  We  plan  to 


reproduce  these  priming  effects  with  the  new 
version  by  running  it  continuously  thus  solv¬ 
ing  various  problems  one  after  another. 
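
The qualitative pattern (strong priming right after the priming event, fading to nothing within roughly 25 minutes) can be sketched with a simple decay function. The exponential form, the half-life and the cut-off below are assumptions chosen only to reproduce that pattern; the text does not specify the decay function used in AMBR.

```python
# Toy sketch of priming as pre-activation that decays over time.
# The exponential form, HALF_LIFE and THRESHOLD are hypothetical choices.

HALF_LIFE = 5.0   # assumed half-life of the residual activation, in minutes
THRESHOLD = 0.05  # assumed level below which priming has no effect


def primed_activation(initial, minutes, half_life=HALF_LIFE):
    """Residual activation of a pre-activated agent after `minutes`."""
    return initial * 0.5 ** (minutes / half_life)


for t in (0, 5, 10, 25):
    a = primed_activation(1.0, t)
    print(t, round(a, 3), a >= THRESHOLD)
```

With these values the primed agent still carries a quarter of its initial activation after 10 minutes and has dropped below the cut-off by 25 minutes.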

The  main  conclusion  from  the  consider¬ 
ations  in  this  section  is  that  in  order  to  build 
adequate  models  of  analogy-making,  we  need 
to  base  them  on  massively  parallel  architec¬ 
tures  allowing  the  parallel  work  and  interac¬ 
tion  of  many  small  processing  entities.  In  ad¬ 
dition  the  architecture  should  allow  for  dynam¬ 
ic  short-term  changes  in  the  structure  of  inter¬ 
actions  between  these  entities,  something  that 
current  connectionist  models  do  not  allow. 

AMBR  and  the  underlying  cognitive  ar¬ 
chitecture  DUAL  are  definitely  not  the  best 
solution  to  these  requirements.  For  example, 
top-down pressure (“the invisible hand” of
the  context)  is  limited  to  the  current  distri¬ 
bution  of  activation  over  agents  which  facil¬ 
itates  the  local  communication  between 
agents  in  one  direction  and  inhibits  it  in  an¬ 
other,  supports  certain  coalitions  of  agents 
and  suppresses  others.  It  is  doubtful  that  this 
would  be  enough  to  explain  all  context  and 
priming effects. On the other hand, CopyCat
and  TableTop  have  one  additional  top-down 
pressure  which  is  called  “temperature”  and 
reflects  an  internal  evaluation  of  the  mental 
state  and  how  close  the  system  is  to  the  solu¬ 
tion  of  the  problem.  A  problem  with  this  ap¬ 
proach  is  that  it  assumes  the  existence  of  a 
centralized  agent  watching  the  whole  situa¬ 
tion,  computing  the  temperature  and  then 
communicating  it  back  to  all  agents  -  this 
resembles  again  centralized  “government” 
control,  although  it  is  weak  control  -  it  does 
not  specify  what  the  agents  should  do,  but 
only  changes  their  biases  and  thresholds. 

The  next  question  to  be  discussed  in  the 
last  section  is  whether  the  mechanisms  per¬ 
forming  analogy-making  can  be  considered 
domain-specific  and  thus  form  something  that 
several  researchers  have  called  an  analogy¬ 
making  engine. 


*  This  is  analogous  to  the  following  phenomenon  in  econ¬ 
omy  -  the  bankruptcy  of  a  single  bank  can  trigger  off  a  chain 
of  bankruptcies  and  eventually  a  global  financial  crisis. 




Boicho Kokinov


4.  FROM  A  SPECIALISED  ENGINE 
TOWARDS  AN  EMERGENT 
PHENOMENON:  INTEGRATING 
ANALOGY  WITH  OTHER  COGNITIVE 
PROCESSES 

If  analogy-making  is  modeled  within  a 
highly  parallel  architecture  of  “the  society  of 
mind” type, then there is no need to assume
that there are mechanisms or agents which are
so specific that they are solely used for analogy-making.
On the contrary, the analogy-making
process  would  be  considered  as  an  emergent 
phenomenon,  i.e.  that  is  how  we  describe  cer¬ 
tain  types  of  emergent  behavior  produced  by 
the  society  of  agents.  AMBR,  for  example,  uses 
mechanisms  like  spreading  activation,  marker 
passing,  etc.  which  in  no  way  may  be  consid¬ 
ered  as  specific  for  analogy-making.  Spread¬ 
ing  activation,  in  particular,  is  involved  in  all 
memory  processes;  marker  passing  is  involved 
in  the  processes  of  evaluating  semantic  simi¬ 
larity,  categorization,  directed  search,  property 
inheritance, etc. A process that might seem
more  specific  for  analogy-making  is  the  ability 
of  agents  to  establish  hypotheses  for  structure 
correspondence  (i.e.  correspondence  between 
substructures),  however,  this  process  seems  so 
fundamental  that  it  is  doubtful  that  it  is  specif¬ 
ically  designed  for  analogy-making  -  all  pro¬ 
cesses  of  perception  would  need  some  struc¬ 
ture  correspondence  abilities,  all  relational  pro¬ 
cessing  would  also  require  this  ability. 

If we subscribe to the “emergent phenomenon”
view on analogy, then it would be natural
to integrate it with all other cognitive processes:
they simply emerge from the
collective  behavior  of  the  same  micro-agents. 
Then  the  boundaries  between  analogy-making, 
perception,  memory,  deductive  reasoning,  etc. 
can  be  described  as  conventional  -  as  classifi¬ 
cation  of  various  types  of  collective  behavior 
of  the  same  set  of  agents  and  produced  by  the 
same  mechanisms  (probably  in  different  pro¬ 
portions).  Thus  Kokinov  (1988,  1990,  1994c) 
has  argued  that  the  boundaries  between  analo¬ 
gy,  deduction  and  generalization  are  a  conven¬ 
tion  and  that  these  processes  are  implemented 


by the same mechanisms. Of course, this is as yet
only an unsubstantiated hypothesis.

This  paper  is  probably  too  general  and  full 
of  speculation,  however,  its  purpose  has  been 
neither to describe AMBR in detail (which is
not  possible  because  of  space  limitations),  nor 
to  defend  its  basic  principles.  I  am  fully  aware 
of  the  fact  that  these  principles  express  only  one 
possible  point  of  view  on  modeling  analogy¬ 
making.  The  purpose  is  to  present  some  chal¬ 
lenges  to  current  models  of  analogy-making  as 
seen  by  the  author  and  to  suggest  possible  ways 
of  meeting  them  hoping  to  combine  these  ideas 
with other views expressed during the workshop.

ABSTRACT

Diagrams  often  use  repetition  to  convey 
points  and  establish  contrasts.  This  paper  shows 
how  MAGI,  our  model  of  repetition  and  sym¬ 
metry  detection,  can  model  the  cognitive  pro¬ 
cesses  humans  use  when  reading  repetition- 
based  diagrams.  MAGI,  which  is  based  on  the 
Structure  Mapping  Engine,  detects  repetition 
by aligning both visual and conceptual relational
structure. This lets visual regularity of form
support  an  understanding  of  the  conceptual  reg¬ 
ularity  such  forms  often  depict.  We  describe 
JUXTA,  which  uses  this  insight  to  critique  a 
class  of  diagrams  that  juxtapose  similar  scenes 
to  demonstrate  physical  laws. 

INTRODUCTION 

In  explanatory  diagrams,  repeated  struc¬ 
tures  often  have  special  significance.  To  under¬ 
score  a  point  or  emphasize  a  difference,  dia¬ 
grams  often  juxtapose  events,  scenes,  or  objects. 
Examples  include  a  “before  and  after”  display 
of  shirts  in  a  laundry  detergent  ad  and  a  point- 
by-point  comparison  of  pumps  in  a  physics  text. 
In such cases, visual repetition heightens contrasts
and encourages deeper comparisons. This
effect  is  an  instance  of  what  we  have  termed 
analogical  encoding  (Ferguson,  1994),  because 
it  uses  repetition  and  symmetry  detection  to 
support  other  reasoning  processes. 

Diagram  designers  have  long  known  the 
utility  of  repetition.  Edward  Tufte  writes  that 
repeating  structure  “takes  advantage  of  our  no¬ 
table  capacity  to  compare  and  reason  about 
multiple images that appear simultaneously
within  our  eyespan.  We  are  able  to  canvas,  iden¬ 


tify,  reconnoiter,  select,contrast,  review — ways 
of seeing quickened and sharpened by the direct
spatial adjacency of parallel elements.”
(Tufte,  1997,  p.  80).  Repetition,  detectable  at  a 
glance,  aids  the  reader  in  exploring,  and  thus 
understanding,  a  diagram. 

An example illustrates this point. Figure 1
is from a solar energy text (Buckley, 1979). This
diagram illustrates a principle of heat transfer
by juxtaposing two scenarios. In these scenarios,
heat flows from a hot liquid, along an immersed
metal bar, to a melting ice cube. Because
heat flows faster in the leftmost scene, its ice
cube melts more quickly. This difference between
the scenarios shows how increasing a
conductor’s cross-sectional area increases heat transfer.


Thick  Bar  Conducts  More  Heat 


Figure 1. A diagram from Buckley (1979).

The diagram uses repetition to good effect.
The two scenarios not only contain the
same physical elements, but are also visually
similar. Before understanding the processes
or the physical objects, the diagram reader
may sense this visual “echo”, which divides
the diagram into two parts. This division
signals the reader that these two parts are
to be compared. Then, the visual correspondence
of similar shapes supports the conceptual
correspondence of the two cups, two
bars, and two heat flows that are key to understanding
the point in the caption.

If the designer had arranged the two scenes to be similar in
conceptual but not visual terms (if, for example, the cup and
ice cube were shaped or arranged differently), the reader could
still understand the diagram. But she might not recognize the
implicit comparison as instantly as before. The diagram’s
visual repetition allows its conceptual comparison to be
quickly grasped.

This  diagram  is  also  designed  so  that  all 
differences  are  relevant.  The  sole  differenc¬ 
es  in  the  diagram  are  the  greater  thickness  of 
the  left  metal  bar,  and  the  greater  volume  of 
water  dripping  from  the  left  ice  cube.  These 
differences are tied to the point of the caption:
“Thick  bar  conducts  more  heat.”  The  thicker 
bar  is  the  independent  variable,  and  the  in¬ 
creased  melting  visibly  indicates  the  greater 
heat  flow. 

Other differences could have been allowed. The cups could
differ in volume or height, or the metal bars could differ not
just in thickness, but in length. Intuitively, however, such
differences would make the diagram less clear. As Tufte notes,
“[i]nformation consists of differences that make a difference”
(1997, p. 65).

The two repetition-based techniques used by this diagram (using
visual regularity to support a conceptual comparison, and
limiting differences to only those relevant to the diagram’s
point) are our starting point for a cognitive model of how
humans comprehend repetition in diagrams.



STRUCTURAL ALIGNMENT PROCESSES IN DIAGRAMMATIC REASONING

Why  should  visual  repetition  aid  diagram 
comprehension? How does difference contribute
to understanding? We believe the answers
may  lie  in  structure-mapping  processes. 

Our explanation involves two models. The first, MAGI, is a
model of repetition and symmetry detection which links
regularity detection with analogical mapping. The second,
Markman and Gentner’s alignable difference model, shows that
difference detection depends on structural alignment. Based on
these two models, we describe three diagram design defects that
occur in repetition-based diagrams.

MAGI 

Similarity and analogical comparison can be modeled as the
structural alignment of propositional descriptions
(Falkenhainer, Forbus, & Gentner, 1989; Forbus, Ferguson, &
Gentner, 1994; Gentner, 1983; Gentner, 1989; Goldstone, 1994;
Holyoak & Thagard, 1989; Keane & Brayshaw, 1988).

MAGI (Ferguson, 1994, in preparation) is the first model
linking regularity detection with similarity. MAGI is based on
the idea that symmetry and repetition (both visual and
conceptual) can be viewed as a similarity mapping between a
description and itself. Using an extension of the Structure
Mapping Engine (SME), MAGI uses structural alignment to detect
regularity within a single description. Like SME, MAGI’s
mapping process is computationally tractable because it
operates in a local-to-global fashion. Individual alignments
are constructed in parallel and then aggregated into global
mappings, which are governed by systematicity constraints
favoring relationally deep, interconnected correspondence sets.
MAGI also operates incrementally. As new information is added
to a description, MAGI’s mapping can be extended appropriately.
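
The idea of treating repetition as a mapping between a
description and itself can be illustrated with a toy sketch.
This is not MAGI's algorithm, which builds local matches and
merges them under systematicity constraints; it is a brute-force
search, and the relation names and the `self_mappings` helper
are hypothetical, chosen only to mirror the heat-flow example.

```python
from itertools import permutations


def self_mappings(entities, facts):
    """Return non-identity entity permutations that map the fact set onto itself.

    `facts` is a set of tuples (relation, arg1, arg2, ...).
    Brute force over all permutations: only usable for a handful of entities.
    """
    results = []
    for perm in permutations(entities):
        mapping = dict(zip(entities, perm))
        if all(mapping[e] == e for e in entities):
            continue  # skip the trivial identity mapping
        # Rename every argument of every fact through the candidate mapping.
        image = {(f[0],) + tuple(mapping[a] for a in f[1:]) for f in facts}
        if image == facts:
            results.append(mapping)
    return results


# Hypothetical propositional description of the two juxtaposed scenes.
entities = ["cup1", "bar1", "ice1", "cup2", "bar2", "ice2"]
facts = {
    ("immersed-in", "bar1", "cup1"),
    ("touches", "bar1", "ice1"),
    ("immersed-in", "bar2", "cup2"),
    ("touches", "bar2", "ice2"),
}

found = self_mappings(entities, facts)
for mapping in found:
    print(mapping)
```

The only non-trivial self-mapping pairs each object in one scene
with its counterpart in the other, which is the kind of
correspondence set the text describes.
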

To detect regularity, MAGI maps over a visual representation
built by GeoRep. GeoRep, given a line drawing, builds a
propositional description of its salient perceptual relations.
Starting with the drawing’s graphical primitives (line
segments, arcs, circles, ellipses, and spline curves), a set of
visual routines (Ullman, 1984) represents a variety of
relationships, including types of object connection,
parallelisms, horizontal and vertical relations, and
descriptions of polygons and their inflexion points. GeoRep
contains a rule engine, and its default rule set can be
extended to handle particular domains.

Given a stylized line drawing of our example diagram (Figure
2), MAGI can map over the diagram’s perceptual relations to
determine object correspondences (Figure 3). If we add
information about the physical objects and processes in the
diagram, MAGI can extend its mapping accordingly.

MAGI can help explain the immediacy and utility of visual
regularity. It describes how repetition is detected and the
nature of the correspondences produced. More importantly,
however, it provides a link between perceptual and conceptual
regularity.

Based on MAGI’s model, we assume the reader of the diagram
begins by detecting its visual regularity (Figure 3). As
conceptual information is also acquired, the reader may attempt
to use this information to extend the mapping. However, if the
new conceptual information cannot be mapped consistently with
the visual information, the reader may either fail to notice
the conceptual regularity, or need to ignore the previous
visual regularity. Handling this conflict may slow or block
diagram comprehension.

Visual repetition and symmetry detection operate very early in
perception. Visual symmetry can be detected after display times
of less than 100 ms (Carmody, Nodine, & Locher, 1977; Corballis
& Roldan, 1975; Julesz, 1971). Consequently, most models of
symmetry detection do not incorporate more complex algorithms
such as structural alignment (with the notable exception of
Wagemans’ bootstrapping model (1995)). Until recently, it
seemed unlikely that visual symmetry detection could involve
alignment.

However, new results from Aminoff, Ferguson and Gentner (in
preparation; 1996) provide evidence that even the earliest
forms of symmetry detection may involve alignment. In two
experiments, Aminoff et al. (in preparation) showed subjects
symmetric and asymmetric polygons with display times of 50 ms.
In each experiment, subjects were consistently faster or more
accurate at judging the asymmetry of polygons containing
aligned qualitative differences, including differences in
corner concavity and number of vertices. This effect was
independent of several other quantitative asymmetry measures,
including differences in area and radial length. Thus, these
results are new evidence for alignment early in symmetry
detection. For this reason, it is entirely possible that
structural alignment is used for both very early and much later
forms of regularity detection.

Figure 2. Stylized redrawing of Figure 1.


Figure 3. Regularity found by MAGI using structural alignment.

ALIGNABLE DIFFERENCES IN COMPARISON

The  MAGI  model  explains  how  visual  rep¬ 
etition  can  support  an  understanding  of  concep¬ 
tual  repetition.  However,  we  have  not  yet  ad¬ 
dressed  the  utility  of  differences  in  a  diagram. 

Of course, difference detection might be seen as a very
different process from repetition detection. In Tversky’s
influential contrast model of comparison (1977), similarity
increases as a function of common features, and decreases as a
function of mismatched features. If individual features are
assumed independent, the detection of matched features would
neither encourage nor block the detection of mismatched
features.
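
A small sketch may make the contrast model concrete. In
Tversky's formulation, similarity is a weighted count of shared
features minus weighted counts of each item's distinctive
features. The weights and the feature sets below are
illustrative assumptions, not values from Tversky or from the
studies cited here.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5, f=len):
    """Tversky-style contrast model over feature sets.

    S(a, b) = theta*f(A & B) - alpha*f(A - B) - beta*f(B - A)
    The weights theta, alpha, beta and the salience function f are
    free parameters; the defaults here are arbitrary.
    """
    a, b = set(a), set(b)
    return theta * f(a & b) - alpha * f(a - b) - beta * f(b - a)


# Hypothetical feature lists, for illustration only.
hotel = {"building", "rooms", "beds", "lobby", "elevator"}
motel = {"building", "rooms", "beds", "parking-at-door"}

score = contrast_similarity(hotel, motel)
print(score)
```

Each feature contributes independently to the score, which is
why, under this model, finding a match tells us nothing about
where to look for a mismatch.
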

Studies by Markman and Gentner, however, found evidence that
alignment significantly affected the kinds of differences human
participants noticed, with most differences directly linked to
preexisting aligned commonalities (and thus called alignable
differences). This model predicts that increasing the
similarity of two concepts also increases the number of
alignable differences noticed. This prediction was borne out in
their experiments. When human participants were asked to list
differences between high and low similarity word pairs (Markman
& Gentner, 1993), participants consistently listed more
alignable differences for pairs with high similarity (hotels
and motels) than for pairs with low similarity (magazine and
kitten). A second set of experiments (Markman & Gentner, 1996)
generalized the results for word pairs to pairs of pictures,
and also showed that alignable differences had a greater effect
on participants’ judgment of similarity than did nonalignable
differences. When determining differences between two things,
people seem to focus more on alignable than nonalignable
differences.

Because alignable differences are produced more often than
nonalignable differences, and because they have a greater
influence on participants’ similarity judgments, it is
reasonable to assume that alignable differences are critical to
the contrasts undertaken in repetition-based diagrams. Because
alignable differences are easily generated in the context of
structural alignment, visual alignable differences may
communicate their points very effectively.

We conjecture that structural alignment has a profound effect
on diagram understanding. Visual alignment supports conceptual
alignment, and also highlights alignable differences.

THREE PROBLEMS OF DIAGRAM STYLE

Which factors, by analogy with understanding written prose,
make a diagram more comprehensible? As we have seen, repetition
in diagrams should be visually apparent, and should draw the
reader into a deeper conceptual comparison without causing
missteps or misalignments. Alignable differences should be
salient and should serve the point of the diagram. These
criteria suggest three general types of design defects that may
hinder comprehension of repetition-based diagrams.

Visual/conceptual cross-mappings. Cross-mappings (Gentner &
Toupin, 1986) occur when surface information and relational
information suggest different mappings for the same objects.
Visual cross-mappings occur when two objects are visually
alignable, but the roles or functions of the aligned objects
are not equivalent. For example, if two oblong objects match,
but one is a metal bar conducting heat, and the other the
handle of a container, the initial visual correspondence
between the parts might confuse readers. The readers might seek
some common functional role between the two objects, and find
none, slowing them down.

Alignable differences that are either not salient or not
compelling. Some alignable differences are more noticeable than
others. In our example diagram, for instance, many people find
the difference in the number of water droplets easier to spot
than the difference in thickness for the two metal bars.

We do not yet have a theory of what makes alignable differences
salient or compelling. Understanding salience alone requires a
more complex model of visual attention than we have available.
However, we can define techniques to make alignable differences
either more salient or more compelling, a process we call
difference amplification.

We can make alignable differences more salient either by adding
alignable structure that draws attention to the difference, or
by other techniques, such as color. Besides making differences
more salient, we
can  also  make  them  more  compelling  by  mak¬ 
ing  the  importance  of  the  difference  more  evi¬ 
dent.  We  do  this  by  making  it  easier  for  the  di¬ 
agram  reader  to  link  the  visual  alignable  differ¬ 
ences  to  the  conceptual  differences  underlying 
the  diagram’s  point.  Labeling  is  the  easiest  way 
to  accomplish  this. 

Aligned differences unrelated to, or interfering with, the
point of the diagram. When alignable differences exist, they
should be related to the diagram’s point. Some alignable
differences may be irrelevant; if our diagram had one cup
colored red, and the other blue, this difference would be
obvious but unlikely to confuse the reader. Alignable
differences may detract from a diagram when they appear to be
related to the point of the diagram, but are not. If one cup
were being heated with a burner in our diagram, we might be
confused about how this particular difference relates to the
role of the thicker metal bar, since both the flame and
relative bar thickness would affect the rate of heat flow.¹
Such alignable differences make it more difficult to draw a
conclusion from the diagram, and thus hinder the reader’s
ability to comprehend the point of the juxtaposed situations.

To summarize, the MAGI model and Markman and Gentner’s
alignable difference model suggest three ways in which a
diagram can be confusing. First, it may contain a
visual-conceptual cross-mapping. Second, alignable differences
may not be salient. Finally, alignable differences may be
irrelevant or may interfere with the point of the diagram.
These three criteria can be easily characterized in terms of
the MAGI model and some simple assumptions about visual
representation.


Because  these  stylistic  problems  can  be 
cleanly  described  in  terms  of  the  MAGI  mod¬ 
el,  it  is  possible  to  build  a  diagram  critic  that 
uses  these  principles  to  parse  and  critique  dia¬ 
grams.  We  can  use  mismatches  between  corre¬ 
spondences  at  the  visual,  physical  and  process 
levels  to  determine  how  well  the  visual  regu¬ 
larity  in  the  figure  guides  the  comparison.  If 
we  have  a  representation  of  the  diagram’s  point 
(which  often  can  be  derived  from  the  caption), 
we  can  also  determine  if  the  alignable  differ¬ 
ences  in  the  figure  convey  the  point,  are  orthog¬ 
onal  to  the  point,  or  get  in  the  way  of  under¬ 
standing  the  point. 

We have built such a system, called JUXTA.² Given diagrams that
juxtapose physical situations, JUXTA can produce a critique of
the figure, and note differences that may confuse or distract
the reader. JUXTA also amplifies a diagram’s relevant alignable
differences by labeling them, using its physical knowledge to
create and place useful explanatory labels.

We  now  summarize  how  JUXTA  works. 

THE JUXTA ARCHITECTURE

Figure 4 describes JUXTA’s architecture. JUXTA’s input is a
stylized line-drawn diagram and a representation of the
diagram’s caption. It provides three kinds of feedback. First,
it amplifies relevant alignable differences by labeling them
with process descriptions. Second, it critiques differences
that interfere with the point of the diagram (as given in the
caption). Finally, it notes differences that are orthogonal to
the point of the diagram, and thus may be removed at the
designer’s discretion.


1 Tufte (1997) gives an example of how this principle is
violated in the “before and after” drawings done by the 19th
century architect Humphrey Repton. Repton’s “after” views often
embellish. For example, a landscaping proposal adds changes to
the “after” view that are appealing but are unrelated to the
proposed modification, such as stylishly dressed people on the
sidewalks and fine sailing ships in the adjacent harbor.

2  JUXTA  stands  for  Juxtaposition  Understanding  and 
Explanation  Through  Analogy. 

Processing the figure

First, JUXTA (using the GeoRep visual representation engine)
represents the diagram at three different levels: a visual
level (e.g. a square), a physical level (an ice cube), and a
physical process level (heat flowing into an ice cube), using a
set of rules and low-level visual routines (Figure 5).

JUXTA  uses  a  simplified  model  of  object 
recognition,  which  depends  on  a  set  of  rules  to 
determine  when  a  set  of  visual  entities  repre¬ 
sent  a  particular  type  of  structured  object.  The 
heuristics used for object recognition are summarized
in Table 1. This technique requires the
use  of  stylized  diagrams,  but  otherwise  retains 
much  of  the  flexibility  of  general  diagrams.  For 
example,  objects  can  be  drawn  using  a  drawing 
program,  object  dimensions  can  vary  as  need¬ 
ed,  and  diagram  parts  can  be  composed  into 
more comprehensive scenes.

Of course, JUXTA also needs a representation of the caption,
which is assumed to contain the point of the diagram. To avoid
doing natural language interpretation, we give JUXTA the
representation of the caption directly. The representations use
Qualitative Process Theory (Forbus, 1984). It is useful to
identify two parts of captions for juxtaposition diagrams, the
antecedent and the consequent. In this caption, the antecedent
is the difference in thickness of the bars and the consequent
is the difference in the rates of heat flow.


Finding  regularity  and  differences 


JUXTA runs MAGI on the figure to detect correspondences (Figure
3). JUXTA then uses a simple mechanism for detecting alignable
differences based on finding differences in dimensions
predetermined by the object category. For example, when two
trapezoids correspond, JUXTA compares their height and length.
Invisible differences, such as differences in the rate of a
physical process, are inferred from visual differences via
rules in a domain-dependent knowledge base. For example, if the
two trapezoids represent two cups, and one trapezoid is larger,
then JUXTA infers that the cup represented by that trapezoid
has greater volume. This way, visible differences enable JUXTA
to infer deeper conceptual differences.
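
The two steps just described, comparing category-specific
dimensions of corresponding shapes and then inferring conceptual
differences by rule, might be sketched as follows. The table of
dimensions, the rule table, and all names and measurements are
hypothetical stand-ins; the paper does not give JUXTA's actual
rule format.

```python
# Dimensions compared for each visual category (hypothetical table).
DIMENSIONS = {
    "trapezoid": ("height", "length"),
    "rectangle": ("thickness", "length"),
}

# Hypothetical domain rules: (physical object, visual dimension) -> quantity.
RULES = {
    ("cup", "height"): "volume",
    ("metal-bar", "thickness"): "cross-sectional area",
}


def alignable_differences(correspondences, shapes):
    """Compare corresponding shapes along their category's dimensions."""
    diffs = []
    for name_a, name_b in correspondences:
        a, b = shapes[name_a], shapes[name_b]
        if a["shape"] != b["shape"]:
            continue  # this sketch only compares shapes of the same category
        for dim in DIMENSIONS.get(a["shape"], ()):
            if a[dim] != b[dim]:
                pair = (name_a, name_b) if a[dim] > b[dim] else (name_b, name_a)
                diffs.append((dim,) + pair)  # (dimension, greater, lesser)
    return diffs


def infer_conceptual(diffs, shapes):
    """Map visible differences to conceptual ones using the rule table."""
    inferred = []
    for dim, greater, lesser in diffs:
        quantity = RULES.get((shapes[greater]["object"], dim))
        if quantity is not None:
            inferred.append((quantity, greater, lesser))
    return inferred


shapes = {
    "cup-a": {"shape": "trapezoid", "object": "cup", "height": 10, "length": 8},
    "cup-b": {"shape": "trapezoid", "object": "cup", "height": 10, "length": 8},
    "bar-a": {"shape": "rectangle", "object": "metal-bar", "thickness": 4, "length": 20},
    "bar-b": {"shape": "rectangle", "object": "metal-bar", "thickness": 2, "length": 20},
}
correspondences = [("cup-a", "cup-b"), ("bar-a", "bar-b")]

visual_diffs = alignable_differences(correspondences, shapes)
conceptual_diffs = infer_conceptual(visual_diffs, shapes)
print(visual_diffs)
print(conceptual_diffs)
```

Because differences are only sought between shapes that already
correspond, every difference found this way is alignable by
construction.
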

Amplifying  differences  via  labeling 


At this point, JUXTA has analyzed the figure at the visual,
physical, and process levels. For each of those levels, it has
computed the representation of that level, the regularity
mapping for that level, and the set of aligned differences. It
can now begin its critical analysis of the figure. First, JUXTA
attempts to link the aligned differences to the antecedents and
consequents of the point given in the caption. It then
amplifies the aligned differences by labeling them. The labels
link each key difference in the caption to some visual
difference.

Table 1. Visual legend for recognized objects.

To link the objects in the diagram with the referents in the
caption representation, JUXTA matches the caption
representation against the physical and process representations
of the diagram, and uses this match to fill the caption
representation’s unfilled slots. This is how, for example,
JUXTA figures out which object or objects the caption’s
“thicker bar” refers to. In this case, JUXTA can find the
thicker bar on the right using the common object category
(metal bar) to select both metal bars, and using the alignable
difference (thicker) to distinguish between them.
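
Resolving a comparative referent such as “thicker bar” can be
sketched in a few lines: select by object category, then
discriminate by the alignable difference. The function name,
the object records, and the restriction to exactly two
candidates are assumptions made for this illustration, not a
description of JUXTA's matcher.

```python
def resolve_comparative(objects, category, dimension):
    """Return the name of the object of `category` with the larger `dimension`.

    Returns None when the category does not pick out exactly two objects,
    or when the two objects do not differ along the dimension.
    """
    candidates = [o for o in objects if o["category"] == category]
    if len(candidates) != 2:
        return None  # the sketch handles binary juxtaposition only
    first, second = candidates
    if first[dimension] == second[dimension]:
        return None  # no alignable difference to distinguish them
    return max(candidates, key=lambda o: o[dimension])["name"]


# Hypothetical physical-level description of the diagram.
objects = [
    {"name": "bar-a", "category": "metal-bar", "thickness": 2},
    {"name": "bar-b", "category": "metal-bar", "thickness": 4},
    {"name": "cup-a", "category": "cup", "height": 10},
    {"name": "cup-b", "category": "cup", "height": 10},
]

referent = resolve_comparative(objects, "metal-bar", "thickness")
print(referent)
```
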

Once JUXTA understands which parts of the figure are being
referenced in the caption, it labels the differences. This
involves constructing paired labels for each alignable
difference given in the caption, and then determining where to
place each label. To label an alignable difference, JUXTA must
find a visible referent to point to. When an alignable
difference is along a visible dimension (such as the thickness
of a bar), the object itself is the referent of the label, and
JUXTA points to the shape which represents the physical object.
Alternatively, when a caption relationship is not visible (such
as heat flow along the metal bar), JUXTA looks for a
consequence of the relationship which is a visible difference.
In the example figure, the difference in heat flow causes a
difference in the rate at which the ice cube melts, causing a
visible difference in the number of drops (ellipses), so JUXTA
labels this. The result of the labeling stage on the example
diagram is given in Figure 6.

Figure 6. Results of labeling stage of JUXTA on example diagram.

Critiquing  the  diagram 

After labeling the figure, JUXTA critiques how well the
alignable differences contribute to the point of the caption.
To do this, JUXTA looks at all alignable differences left over
from the labeling stage. These are differences that are not
related to alignable differences referenced in the caption. If
a remaining alignable difference is not the result of the
caption antecedent, but can have an effect on its consequent,
JUXTA notes it as potentially confusing. For example, Figure 7
is a variant of our example diagram that contains this problem.
Here, the amount of heat rising from the second cup is larger
than that from the first. JUXTA notes this difference as
confusing because the amount of heat from the container implies
that the second container may contain a hotter liquid, which
would also increase the heat flow rate.

Of course, remaining alignable differences may not relate to
the caption at all. In this case, JUXTA will not mark the
difference as confusing, but will note its orthogonal status.
For example, in Figure 7, JUXTA will note that one spline curve
in the leftmost group is longer. Removing this difference might
make interpretation somewhat simpler, but it will not cause
problems if left unchanged.
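
The critique rule can be summarized in a short sketch. It
assumes a hypothetical table of direct causal influences between
quantities; the quantity names, the `critique` function, and the
third "consistent" verdict (for leftover differences that the
antecedent itself would cause, a case the text does not discuss)
are illustrative assumptions.

```python
def critique(remaining, antecedent, consequent, influences):
    """Classify leftover alignable differences relative to the caption.

    `influences` is a set of (cause, effect) pairs; only direct
    influences are checked, with no transitive closure.
    """
    report = {}
    for diff in remaining:
        if (antecedent, diff) in influences:
            report[diff] = "consistent"  # assumed case: explained by the antecedent
        elif (diff, consequent) in influences:
            report[diff] = "confusing"   # a rival cause of the consequent
        else:
            report[diff] = "orthogonal"  # unrelated to the caption's point
    return report


# Hypothetical causal knowledge for the heat-flow example.
influences = {
    ("bar-thickness", "heat-flow-rate"),
    ("liquid-temperature", "heat-flow-rate"),
    ("heat-flow-rate", "melt-rate"),
}

report = critique(
    remaining=["liquid-temperature", "steam-curve-length"],
    antecedent="bar-thickness",
    consequent="heat-flow-rate",
    influences=influences,
)
print(report)
```
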

CONCLUSION 

Analogical  encoding  techniques,  based  on 
current  models  of  analogy  and  similarity,  can 
provide  key  insights  into  diagrammatic  reason¬ 
ing.  We  have  shown  how  MAGI,  which  uses 
structure  mapping  to  detect  repetition  and  sym¬ 
metry,  may  explain  how  visual  and  conceptual 
regularity support one another, and how alignable differences
emphasize relevant points. This model is strong enough to build
a system, JUXTA, that can parse, analyze, and critique a
diagram by analyzing how correspondences and differences
interact between the visual, physical and process levels.

While  JUXTA  demonstrates  the  basic 
principles  behind  a  whole  class  of  diagram¬ 
matic  reasoners,  the  current  implementation 
is  limited.  JUXTA  has  only  been  used  on  a 
handful of figures. The recognition of objects and processes
remains brittle. We are exploring similarity-based feature
re-interpretation as one mechanism for improving the system’s
flexibility.

JUXTA also deals solely with diagrams that use binary
repetition to demonstrate physical laws. In practice, diagrams
use many types of regularity, including matrices, multiple
repeated items, sequences, and symmetry. To expand the kinds of
regularity JUXTA handles, MAGI itself may need to be extended
to handle some forms of n-ary symmetry and repetition. This
problem relates more to psychology than programming: although
it is relatively simple to configure a version of MAGI that
recognizes small multiples of a scene, it does not yet do so in
an efficient way, nor does it reflect our understanding of how
humans recognize other forms of regularity. We expect, however,
that JUXTA will soon handle symmetry as well as repeating
diagrams.

Figure 7. A faulty variant of Figure 1.

The just-mentioned variety of diagrammatic regularity speaks to
the fascinating richness of this particular sub-area of
cognition. If the simple mechanisms of JUXTA can be extended to
a larger range of diagrams, they may not only provide a
foundation for computer systems that can understand diagrams in
a more human-like fashion, but may also have interesting
consequences for our understanding of diagrammatic reasoning,
regularity, and analogy.


ABSTRACT

This  paper  outlines  the  main  ideas  and  ob¬ 
jectives  of  the  Metacat  project,  an  extension  of 
the  Copycat  computer  model  of  analogy-mak¬ 
ing  and  high-level  perception.  The  principal 
features  of  Metacat  that  allow  it  to  make  sense 
of  analogies  suggested  to  it  by  the  user  are  de¬ 
scribed  using  a  simple  example. 

INTRODUCTION 

The Copycat computer model of analogy-making and high-level
perception was originally developed by Hofstadter & Mitchell as
a computational model of subcognitive mechanisms underlying
human cognition, in which the notion of fluid concepts plays a
central role. Copycat models the process of analogy-making
within a stripped-down microworld of tiny, idealized situations
represented as short strings of letters. For example, a typical
Copycat problem is the following: “If abc changes to abd, how
does mrrjjj change in an analogous way?” This microworld,
though austere, harbors a surprisingly rich variety of subtle
problems in which a wide range of answers is almost always
possible, often including deeply elegant but non-obvious ones.
For example, there are many defensible answers to the above
problem, including mrrkkk, mrrjjk, mrrjjd, mrrddd, mrrjjj (in
which only c’s are seen as changing), mrsjjj, mrdjjj, mrrjjjj,
mrrkkkk, or even abd or abbddd. The apparent simplicity of
Copycat’s domain is deceptive, for it remains a formidable
challenge to develop a computational model exhibiting a level
of creative and flexible behavior comparable to that of humans
even in this tiny, restricted domain of letter-strings.

Copycat  discovers  analogies  between  dif¬ 
ferent  situations  by  building  up  an  under¬ 
standing  of  the  situations  in  terms  of  concepts 
that  it  understands  about  the  letter-string 
world.  Representations  of  these  concepts  are 
hard-wired  into  the  program,  yet  they  are  not 
static  entities  with  sharply  defined  bound¬ 
aries.  Rather,  their  boundaries  are  inherently 
fuzzy,  overlapping  each  other  to  varying  de¬ 
grees  and  changing  in  response  to  competing 
contextual  pressures  that  arise  during  the 
course  of  processing.  The  dynamic,  “fluid” 
nature  of  Copycat’s  concepts  is  intended  to 
model  the  extremely  flexible  human  ability 
to  perceive  dissimilar  things  as  being  in  fact 
“the  same”  when  viewed  at  some  appropri¬ 
ate  level  of  description. 

A detailed exposition of the Copycat program
can be found in [Mitchell, 1993] and
[Hofstadter  and  FARG,  1995].  In  this  paper, 
we  give  just  a  brief  summary  of  Copycat  and 
then  discuss  in  more  detail  recent  work  aimed 
at extending the model. The goal of the current
project, dubbed Metacat, is to increase
the  program’s  “awareness”  of  its  own  behav¬ 
ior  as  it  solves  analogy  problems,  so  that  it 
may  gain  deeper  insights  into  the  analogies 
it  makes. 

THE COPYCAT MODEL

When  Copycat  is  given  an  analogy  prob¬ 
lem  to  work  on,  it  starts  out  with  the  letter- 
strings  in  its  Workspace,  the  architectural  com¬ 
ponent  of  the  program  in  which  all  perceptual 
processing  occurs.  Small,  nondeterministic  pro¬ 
cessing  agents  called  codelets  notice  relations 
among  the  individual  letters  and  build  new 
structures  around  them,  organizing  them  into  a 
coherent  high-level  picture.  All  processing  oc¬ 
curs  through  the  collective  actions  of  many 
codelets  working  in  parallel,  at  different  speeds, 
on  different  aspects  of  an  analogy  problem, 
without any centralized executive process controlling the
course of events. The stochastic behavior of codelets is
dynamically biased by the time-varying pattern of activation in
the program’s network of concepts, called the Slipnet, that it
uses to build up an understanding of an analogy problem. In
turn, this context-dependent pattern of conceptual activity in
the Slipnet is itself an emergent consequence of codelet
processing in the Workspace.

For  example,  in  order  to  discover  an  an¬ 
swer  to  the  problem  ‘‘abc  =>  abd;  mrrjjj  => 
?”,  codelets  work  together  to  build  up  a  strong, 
coherent  mapping  between  iht  initial  string  abc 
and  the  target  string  mrrjjj,  and  also  between 
the  initial  string  and  the  modified  string  abd. 
Within  each  letter-string,  codelets  attempt  to 
build  hierarchical  groups,  effectively  organiz¬ 
ing  the  strings  (the  raw  perceptual  data)  into 
coherent,  chunked  wholes.  In  mrrjjj,  for  ex¬ 
ample, codelets might build the “sameness-groups”
rr and jjj, causing the sameness-group
concept  in  the  Slipnet  to  become  activated, 
which  in  turn  makes  it  more  likely  for  the  pro¬ 
gram  to  regard  m  as  a  sameness-group  of  length 
one  within  the  context  of  the  other  groups  in  its 
string.  A  higher-level  “successor-group”  com¬ 
prised  of  m,  rr,  and  jjj  encompassing  the  en¬ 
tire  string  can  then  be  seen  based  on  the  con¬ 
cept  of  group-length  (i.e.,  1-2-3)  rather  than 
on letter-category. Consequently, the
letter-category-based successor-group abc can be
mapped as a whole onto the length-based
successor-group mrrjjj, representing the recognition
of these strings as instances of the same
concept,  even  though  their  surface  resemblance 
is  negligible.  The  distributed  nature  of  codelet 
processing  interleaves  the  chunking  process 
with  the  mapping  process,  and  as  a  result,  each 
process  influences  and  drives  the  other. 
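
The chunking described above can be sketched deterministically. Copycat itself builds these groups stochastically through codelets. The helper names here are hypothetical, and the code only shows why mrrjjj and abc can both count as successor-groups.

```python
from itertools import groupby

def sameness_groups(s):
    """Chunk a string into maximal runs of identical letters."""
    return ["".join(run) for _, run in groupby(s)]

def is_successor_sequence(values):
    """True if each value is exactly one more than the previous one."""
    return len(values) > 1 and all(b - a == 1 for a, b in zip(values, values[1:]))

groups = sameness_groups("mrrjjj")          # ['m', 'rr', 'jjj']
lengths = [len(g) for g in groups]          # [1, 2, 3]
letters = [ord(c) for c in "abc"]           # letter categories of abc

print(groups, lengths)
print(is_successor_sequence(lengths))       # successorship by group-length
print(is_successor_sequence(letters))       # successorship by letter-category
```

The letters m, r, j do not form a successor sequence, so the resemblance to abc only appears once the description shifts from letter-category to group-length.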

A  mapping  consists  of  a  set  of  bridges  be¬ 
tween  corresponding  letters  or  groups  that  play 
respectively  similar  roles  in  different  strings. 
Each  bridge  is  supported  by  a  set  of  concept- 
mappings  that  together  provide  justification  for 
perceiving  the  objects  connected  by  the  bridge 
as  corresponding  to  one  another.  For  example, 
a  bridge  might  be  built  between  c  in  abc  and  jjj 
in  mrrjjj,  supported  by  the  concept-mappings 
rightmost  =>  rightmost  and  letter  =>  group, 
representing  the  idea  that  both  objects  are  right¬ 
most  in  their  strings,  and  that  one  is  a  letter  and 
the  other  a  group.  Non-identity  concept-map¬ 
pings  such  as  letter  =>  group  are  called  slip¬ 
pages,  and  form  the  basis  of  Copycat’s  ability 
to  perceive  superficially-dissimilar  situations  as 
being  identical  at  a  deeper  level. 
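
As a rough data-structure sketch (the class and field names are assumptions, not taken from the Copycat implementation), a bridge can be represented as a pair of objects plus its supporting concept-mappings, with slippages being the mappings whose two sides differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConceptMapping:
    source: str
    target: str

    @property
    def is_slippage(self):
        # A slippage is any non-identity concept-mapping.
        return self.source != self.target

@dataclass(frozen=True)
class Bridge:
    obj1: str
    obj2: str
    concept_mappings: tuple

    def slippages(self):
        """Return only the non-identity concept-mappings supporting this bridge."""
        return [cm for cm in self.concept_mappings if cm.is_slippage]

# The bridge between c in abc and jjj in mrrjjj from the text.
bridge = Bridge("c", "jjj", (
    ConceptMapping("rightmost", "rightmost"),
    ConceptMapping("letter", "group"),
))

for cm in bridge.slippages():
    print(f"{cm.source} => {cm.target}")
```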

Once  a  strong,  coherent  mapping  has  been 
built  between  the  initial  string  and  the  modi¬ 
fied  string,  another  type  of  structure,  called  a 
rule,  may  get  created  based  on  this  mapping, 
which  succinctly  describes  the  way  in  which 
the  initial  string  changes  into  the  modified 
string.  There  are  often  several  possible  ways  of 
describing  this  change,  some  more  abstract  than 
others.  For  example,  two  possible  rules  for  abc 
=>  abd  are  Change  letter-category  of  rightmost 
letter  to  successor  and  Change  letter-category 
of rightmost letter to d.

Different  ways  of  looking  at  the  initial/ 
modified  change,  combined  with  different  ways 
of  building  the  initial/target  mapping,  give  rise 
to  different  answers.  The  configuration  of  struc¬ 
tures  in  the  Workspace  collectively  represents 
the  way  in  which  a  given  analogy  problem  is 
interpreted.  A  particular  interpretation  implies 
a  particular  answer  for  the  problem.  To  produce 
an  answer,  the  rule  describing  the  way  the  ini¬ 
tial  string  changes  is  translated  into  a  new  rule 
that  applies  to  the  target  string,  based  on  the 
slippages  underlying  the  initial/target  mapping. 
For example, if the abc => abd change is described
according to the first rule above, and
the  abstract  successor-group  similarity  between 
abc  and  mrrjjj  has  been  noticed,  then  the  rule 
will  be  translated  as  Change  length  of  right¬ 
most  group  to  successor,  yielding  the  answer 
mrrjjjj.  On  the  other  hand,  if  this  deep  simi¬ 
larity has not been noticed, the answers mrrkkk,
mrrjjk, mrrddd, or mrrjjd may be
found instead, depending on the rule chosen and
whether c in abc is seen as corresponding
to the jjj group or to just the rightmost letter
j in mrrjjj.
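
How the choice of rule and the choice of mapping jointly determine the answer can be made concrete with a small sketch. It hard-codes the translation step that Copycat derives from slippages. The function name, the `rule` and `view` labels, and the lack of wraparound past z are all assumptions of this illustration.

```python
from itertools import groupby

def successor(letter):
    # No wraparound: 'z' has no successor in this sketch.
    return chr(ord(letter) + 1)

def chunks(s):
    """Split a string into maximal runs of identical letters."""
    return ["".join(run) for _, run in groupby(s)]

def answer(target, rule, view):
    """Apply a translated rule to the rightmost part of the target string.

    rule: 'successor' (change to successor) or 'd' (change to the letter d).
    view: 'length' (c maps to the group, seen via its length),
          'group'  (c maps to the group, seen via its letter-category),
          'letter' (c maps to just the rightmost letter).
    """
    groups = chunks(target)
    last = groups[-1]
    if view == "length":
        if rule != "successor":
            raise ValueError("only the successor rule has a length reading here")
        new_last = last[0] * (len(last) + 1)       # group length 3 becomes 4
    else:
        new_letter = successor(last[0]) if rule == "successor" else "d"
        if view == "group":
            new_last = new_letter * len(last)      # change the whole group
        else:
            new_last = last[:-1] + new_letter      # change one letter only
    return "".join(groups[:-1]) + new_last

for rule, view in [("successor", "length"), ("successor", "group"),
                   ("successor", "letter"), ("d", "group"), ("d", "letter")]:
    print(rule, view, answer("mrrjjj", rule, view))
```

Only the first combination depends on noticing the group-length successorship of mrrjjj. The remaining ones treat the string at the level of letter-categories.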

As  this  example  suggests,  Copycat’s  sto¬ 
chastic  processing  mechanisms  enable  it  to  find 
a  range  of  different  answers  for  a  given  analo¬ 
gy  problem.  Copycat  attaches  a  rough  numeri¬ 
cal  measure  of  “quality”  to  the  answers  it  finds, 
which,  for  many  problems,  corresponds  reason¬ 
ably  well  to  human  judgments  of  relative  an¬ 
swer  quality.  But  the  program  has  very  little 
awareness  of  how  it  actually  finds  the  answers 
that  it  finds.  It  has  almost  no  insight  into  its 
own  processing  mechanisms — fluid  and  flexi¬ 
ble  though  they  may  be — which  guide  it  through 
the  “space”  of  possible  interpretations  of  an 
analogy  problem.  This  is  not  too  surprising, 
however,  given  that  Copycat  was  intended  pri¬ 
marily  as  a  model  of  subcognitive  mechanisms. 
All  of  the  nondeterministic  codelet  activity  oc¬ 
curring  in  the  Workspace — the  building  of 
bridges  and  groups,  the  making  of  slippages, 
and  so  on — is  intended  to  represent  perceptual 
activity  carried  out  below  the  level  of  “con¬ 
scious  awareness”.  In  contrast,  the  focus  of 
Metacat  is  on  developing  mechanisms  that  sup¬ 
port  a  higher  “cognitive”  level  on  top  of  Copy¬ 
cat’s  subcognitive  level.  To  do  this,  Metacat 
needs  to  be  able  to  remember  what  happens 
while  its  subcognitive  mechanisms  are  build¬ 
ing,  destroying,  and  reconfiguring  Workspace 
structures  in  pursuit  of  an  answer  to  the  prob¬ 
lem  at  hand,  and  to  build  explicit  representa¬ 
tions  of  this  activity. 

METACAT’S  OBJECTIVES 

Hofstadter  has  outlined  several  important 
objectives for the Metacat project [Hofstadter
and FARG, 1995, Chapter 7]. First of all, the
program  should  be  able  to  explicitly  character¬ 
ize the essence of an answer, that is, the core idea or
cluster  of  ideas  underlying  the  answer  that  fun¬ 
damentally  distinguishes  it  from  other  possi¬ 
ble  answers.  The  ability  to  perceive  what  a  giv¬ 
en  answer  is  really  “about”  should  enable  the 
program  to  give  at  least  a  limited  explanation 
of  the  answer’s  strengths  and  weaknesses  com¬ 
pared  to  other  answers  it  may  have  previously 
found.  For  example,  the  essence  of  the  mrrjjjj 
answer  described  earlier  lies  in  seeing  both  abc 
and  mrrjjj  as  successor-groups,  one  based  on 
the concept of letter-category and the other
based  on  the  concept  of  group-length.  The  rec¬ 
ognition  of  this  abstract  similarity  between  the 
strings  is  what  fundamentally  distinguishes  the 
answer mrrjjjj from other, more straightforward
answers such as mrrkkk, mrrjjk, or mrrddd,
in which the hidden “successorship fabric”
of mrrjjj remains unnoticed.

The ability to compare and contrast answers,
however, implies the ability to remember
more than one at a time. In Copycat, answers
are not retained after they are found.
When  Copycat  discovers  an  answer  to  a  prob¬ 
lem,  it  simply  reports  the  answer,  along  with 
the  answer’s  numerical  measure  of  quality,  and 
then  stops.  No  recollection  of  previously  found 
answers  is  possible  on  subsequent  runs  of  the 
program,  so  there  is  no  way  for  the  program  to 
bring  its  past  experience  to  bear  on  its  current 
situation.  This  makes  comparison  of  different 
answers impossible, either within a single analogy
problem or across different problems. In
contrast,  Metacat  should  remember  the  answers 
it  finds,  along  with  characterizations  of  the  key 
ideas  involved,  gradually  building  up  in  its 
memory  a  repertoire  of  experience  on  which  it 
can  draw  when  confronted  with  new  situations. 

In  addition  to  remembering  the  answers  it 
finds,  Metacat  should  also  keep  track  of  pat¬ 
terns  that  occur  in  its  own  processing  while  it 
is  trying  to  discover  new  answers.  As  it  works 
on  an  analogy  problem,  it  should  create  an  ex¬ 
plicit  sequential  trace  of  its  own  behavior  as  it 
searches  through  the  space  of  possible  interpre¬ 
tations  leading  to  different  answers.  This  type 
of memory is of a more short-term, temporal
nature  than  that  just  described  for  the  answers 
themselves.  Such  a  self-watching  ability  would 
enable  Metacat  not  only  to  remember  the  im¬ 
portant  events  that  led  it  to  find  an  answer,  but 
also  to  recognize  when  it  has  fallen  into  a  re¬ 
petitive  or  otherwise  unproductive  pattern  of 
behavior.  Recognizing  that  it  is  in  a  “rut”  should 
enable  it  to  subsequently  “jump  out  of  the  sys¬ 
tem”  by  explicitly  focusing  on  ideas  other  than 
the  ones  that  seem  to  be  leading  it  nowhere. 
This  type  of  self-awareness  pervades  human 
cognition.  People  can  easily  pay  attention  to 
patterns  in  their  own  thinking;  see  for  example 
[Chi  et  al.,  1989,  VanLehn  et  al.,  1992]. 
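
One simple way to picture the self-watching idea is a check over a sequential trace of events. The sketch below is only an assumption about how a "rut" might be detected: the event labels, window size, and threshold are hypothetical and are not taken from Metacat.

```python
from collections import Counter

def in_a_rut(trace, window=6, threshold=3):
    """Return True if one event recurs `threshold` or more times
    among the last `window` entries of the trace."""
    recent = trace[-window:]
    if not recent:
        return False
    return max(Counter(recent).values()) >= threshold

# Hypothetical event labels recorded while the program works on a problem.
trace = ["build a-x bridge", "snag", "build a-x bridge",
         "snag", "build a-x bridge", "snag"]

print(in_a_rut(trace))                       # the same events keep recurring
print(in_a_rut(["build group", "build bridge", "create rule"]))
```

A positive result would be the cue for the program to focus on ideas other than the ones that keep leading to the same events.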

Once  Metacat  has  the  ability  to  size  up  the 
answers  it  finds  in  terms  of  their  essential  fea¬ 
tures,  it  ought  to  be  able  to  evaluate  other  an¬ 
swers  suggested  to  it  by  some  outside  agent.  In 
other words, Metacat should not only be able
to  come  up  with  answers  to  analogy  problems 
on  its  own,  it  should  also  be  able  to  justify  an¬ 
swers  on  their  own  terms,  even  if  the  program 
itself  didn’t  come  up  with  them.  This  consti¬ 
tutes  an  ability  to  work  “backwards”  from  a 
given  answer  toward  an  insightful  characteriza¬ 
tion  of  the  answer,  in  order  to  understand  why 
it  makes  sense.  Once  an  answer  has  been  un¬ 
derstood  in  this  way,  it  could  then  be  compared 
and  contrasted  with  other  answers  that  the  pro¬ 
gram  has  either  itself  discovered  previously,  or 
been  shown  by  someone  else. 

THE  METACAT  MODEL 

The  Metacat  architecture  includes  all  of 
Copycat’s  main  architectural  components,  such 
as  the  Workspace,  the  Slipnet,  and  the  mecha¬ 
nisms  that  support  distributed,  nondeterminis- 
tic  codelet  processing.  In  addition,  new  archi¬ 
tectural  components  have  been  incorporated 
into  the  model,  and  mechanisms  for  building 
bridges  and  creating  rules  have  been  extended 
and  generalized.  These  components  provide  a 
general  framework  in  which  to  address  the  ob¬ 
jectives  outlined  in  the  previous  section. 

Unlike  Copycat,  Metacat  incorporates  a 
memory for its answers, which allows it to remember
more than one answer over the course
of  a  run.  Whenever  it  finds  a  new  answer,  in¬ 
stead  of  simply  stopping,  Metacat  pauses  to 
display  the  answer  along  with  the  Workspace 
structures  representing  the  interpretation  of  the 
problem.  This  information  is  packaged  togeth¬ 
er  and  stored  in  Metacat’s  memory,  after  which 
the  program  continues  searching  for  alternative 
answers  to  the  problem.  Gradually  over  time,  a 
series  of  answers  accumulates  in  memory,  each 
one  representing  a  different  way  of  making 
sense  of  the  analogy  problem  at  hand. 

The  most  important  type  of  auxiliary  in¬ 
formation  stored  with  answers  consists  of 
structures  called  themes.  Themes  reside  in 
Metacat’s  Themespace,  and  represent  key  con¬ 
cepts  underlying  the  mappings  created  be¬ 
tween  letter-strings.  Collections  of  themes 
serve as high-level characterizations of Metacat’s
answers, and provide a basis on which to
compare  and  contrast  answers  with  each  oth¬ 
er.  Themes  are  comprised  of  Slipnet  concepts, 
and  assume  time-varying  levels  of  activation 
ranging  from  -100  to  +100,  depending  on  the 
extent  to  which  the  ideas  they  represent  are 
present  or  absent  in  a  particular  configuration 
of  Workspace  structures. 

Unlike  Copycat,  Metacat  allows  the  user 
to  suggest  a  particular  answer  to  a  given  anal¬ 
ogy  problem.  The  program  then  tries  to  find 
an  interpretation  of  the  problem  that  leads  to 
the  answer  in  question.  As  an  example,  con¬ 
sider  the  problem  “abc  =>  abd;  xyz  =>  ?” 
with  the  answer  wyz  suggested  to  the  program 
by  the  user.  When  run  on  this  problem,  Meta¬ 
cat  attempts  to  justify  the  wyz  answer  by 
searching  for  an  overall  interpretation  of  the 
problem  in  which  this  particular  answer  makes 
sense.  After  several  hundred  codelets  have 
been  run,  structures  built  in  the  Workspace 
typically  include  horizontal  bridges  compris¬ 
ing  the  abc  =>  abd  and  xyz  =>  wyz  mappings 
in which each string is seen as mapping onto
its  counterpart  in  a  straightforward,  left-to- 
right  way  (i.e.,  a-a,  b-b,  c-d,  x-w,  y-y,  and 
z-z  bridges).  Also,  vertical  bridges  map  abc 
and  xyz  onto  each  other  in  a  similarly  straight¬ 
forward, left-to-right way (i.e., a-x, b-y, and
c-z). In addition, the rule Change letter-category
of rightmost letter to successor, describing
how abc changes to yield abd, and the rule
Change  letter-category  of  leftmost  letter  to 
predecessor,  describing  how  xyz  changes  to 
yield  wyz,  both  get  created. 

Several  themes  in  the  Themespace  get  ac¬ 
tivated  in  response  to  the  creation  of  these  var¬ 
ious  Workspace  structures.  Specifically,  four 
horizontal-bridge  themes  characterizing  the 
horizontal  abc  =>  abd  bridges  become  activat¬ 
ed  to  different  degrees.  Two  of  these  themes 
represent  the  ideas  of  letter-category  sameness 
and  letter-category  successorship  within  the  abc 
=>  abd  mapping.  The  a-a  and  b-b  bridges  both 
involve  the  idea  of  letter-category  sameness, 
while  the  c-d  bridge  involves  the  idea  of  suc¬ 
cessorship. Therefore, the themes
Letter-Category:Sameness and Letter-Category:Successor
are both active in the Themespace, although
the successorship theme is not as active
as  the  sameness  theme. 

On  the  other  hand,  all  bridges  map  objects 
of identical string-position (e.g., leftmost =>
leftmost) and object-type (e.g., letter => letter)
onto each other, so the themes
String-Position:Sameness and Object-Type:Sameness
are  highly  active.  These  themes  together  serve 
as  an  abstract  characterization  of  the  abc  => 
abd  mapping.  Other  sets  of  themes  in  the 
Themespace  characterize  other  Workspace 
structures  in  a  similar  fashion. 
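
The relative theme activations described for the abc => abd mapping can be mimicked with a proportional rule. This rule is an assumption of the sketch, not Metacat's actual activation dynamics, and it produces only the non-negative part of the -100 to +100 range.

```python
from collections import Counter

def theme_activations(bridges):
    """Compute an activation for each (dimension, relation) theme.

    bridges: list of dicts mapping a dimension to the relation that a bridge
             exhibits along it (hypothetical encoding).
    Assumed rule: activation = 100 * fraction of bridges showing the relation.
    """
    counts = Counter((dim, rel) for b in bridges for dim, rel in b.items())
    n = len(bridges)
    return {theme: round(100 * c / n) for theme, c in counts.items()}

# The three horizontal bridges of abc => abd: a-a, b-b, and c-d.
horizontal = [
    {"Letter-Category": "Sameness", "String-Position": "Sameness", "Object-Type": "Sameness"},
    {"Letter-Category": "Sameness", "String-Position": "Sameness", "Object-Type": "Sameness"},
    {"Letter-Category": "Successor", "String-Position": "Sameness", "Object-Type": "Sameness"},
]

themes = theme_activations(horizontal)
for (dim, rel), level in sorted(themes.items()):
    print(f"{dim}:{rel} = {level}")
```

Under this rule the sameness theme for letter-category comes out more active than the successor theme, while the string-position and object-type sameness themes are fully active, matching the qualitative pattern in the text.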

Thus,  themes  are  first  and  foremost  repre¬ 
sentational  structures.  But  under  certain  condi¬ 
tions,  when  highly  activated,  they  can  also  ex¬ 
ert  powerful  top-down  pressure  on  Metacat’s 
processing  mechanisms,  strongly  biasing  the 
stochastic  behavior  of  codelets  in  favor  of  par¬ 
ticular  outcomes.  Active  themes  can  be  regard¬ 
ed  as  Metacat’s  way  of  “seizing  on”  certain  key 
ideas  implicit  in  an  analogy  problem  and  mak¬ 
ing  them  explicit,  driving  the  program  toward 
an  interpretation  of  the  problem  organized 
around  these  ideas. 

In the above example, Metacat perceives abc
and  xyz  as  successor-groups  going  in  the  same 
direction  (left-to-right).  This  is  represented  by 
the vertical a-x and c-z bridges, which are supported
by the concept-mappings leftmost => leftmost
and rightmost => rightmost, respectively.
However,  this  way  of  interpreting  the  situation 
doesn’t  make  sense,  because  c  and  x  are  not  seen 
as  corresponding  to  each  other  (since  there  is  no 
bridge  between  them),  yet  they  are  both  identi¬ 
fied  by  the  rules  as  being  the  objects  that  change 
in  their  respective  strings  (the  c  to  its  successor 
and  the  x  to  its  predecessor). 

At  some  point,  codelets  may  compare  the 
two  rules  and  notice  that  taken  together,  they 
imply  the  concept-mappings  rightmost  =>  left¬ 
most  and  successor  =>  predecessor.  These  con¬ 
cept-mappings  suggest  the  idea  of  mapping  the 
strings abc and xyz onto each other in a crosswise
fashion, so that one group is viewed as a
successor-group  and  the  other  is  viewed  as  a 
predecessor-group,  with  the  rightmost  letter  of 
one  corresponding  to  the  leftmost  letter  of  the 
other,  and  vice  versa.  This  idea  can  be  succinct¬ 
ly  characterized  by  a  set  of  vertical-bridge 
themes  representing  string-position  and  group- 
direction  oppositeness.  These  themes  are 
clamped by codelets at full activation, strongly
promoting  the  creation  of  new  structures  com¬ 
patible  with  the  idea  of  a  vertical  crosswise 
mapping  and  greatly  weakening  existing  struc¬ 
tures  incompatible  with  this  idea. 

For  example,  the  a-x  and  c-z  bridges  are 
incompatible  with  the  idea  of  mapping  abc  and 
xyz  onto  each  other  in  opposite  directions,  rep¬ 
resented by the String-Position:Opposite
theme,  since  they  are  supported  by  leftmost  => 
leftmost  or  rightmost  =>  rightmost  concept- 
mappings.  They  are  thus  easily  broken  and  re¬ 
placed by a-z and c-x bridges, which are compatible
with this idea. The net effect is that the
original  vertical  mapping  described  above  is 
swiftly  reorganized  by  codelets  into  a  new  map¬ 
ping  consistent  with  the  activated  themes. 
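
The effect of a clamped string-position theme on the vertical bridges can be sketched as a deterministic filter. In Metacat the reorganization emerges from many stochastic codelets, so the function below is only an idealized end result. Its name, its arguments, and the treatment of the middle position as its own opposite are assumptions.

```python
OPPOSITE = {"leftmost": "rightmost", "rightmost": "leftmost", "middle": "middle"}

def reorganize(bridges, positions1, positions2, clamped):
    """Keep bridges compatible with the clamped theme and replace the rest.

    bridges:    list of (obj1, obj2) pairs between two strings.
    positions1: dict mapping each object of the first string to its position.
    positions2: dict mapping each object of the second string to its position.
    clamped:    'Sameness' or 'Opposite' (the clamped String-Position theme).
    """
    def wanted(pos):
        return pos if clamped == "Sameness" else OPPOSITE[pos]

    by_position2 = {pos: obj for obj, pos in positions2.items()}
    result = []
    for obj1, obj2 in bridges:
        target_pos = wanted(positions1[obj1])
        if positions2[obj2] == target_pos:
            result.append((obj1, obj2))                       # compatible: survives
        else:
            result.append((obj1, by_position2[target_pos]))   # broken and replaced
    return result

abc = {"a": "leftmost", "b": "middle", "c": "rightmost"}
xyz = {"x": "leftmost", "y": "middle", "z": "rightmost"}
old = [("a", "x"), ("b", "y"), ("c", "z")]

print(reorganize(old, abc, xyz, clamped="Opposite"))
```

With the oppositeness theme clamped, the a-x and c-z bridges give way to a-z and c-x, while b-y is left in place.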

Eventually, the burst of new structure-building
activity caused by clamping the pattern
of themes representing oppositeness subsides,
leaving a new (consistent) vertical map¬
ping  in  place,  in  which  abc  is  seen  as  a  succes¬ 
sor-group  going  to  the  right  and  xyz  as  a  prede¬ 
cessor-group  going  to  the  left.  This  way  of  look¬ 
ing  at  things  makes  sense  with  respect  to  the 
wyz answer, since c and x are seen as corre¬
sponding.  In  this  way,  themes  allow  Metacat  to 
effectively  work  backwards  from  a  given  an¬ 
swer  to  a  high-level  understanding  of  why  the 
answer  makes  sense. 

In conclusion, Metacat’s themes can be
viewed  as  a  medium  through  which  ideas  made 
explicit at the “cognitive” level can actively
influence  and  guide  the  course  of  processing  at 
the  “subcognitive”  level.  By  strongly  activat¬ 
ing  different  patterns  of  themes  in  the 
Themespace,  the  program  can  explicitly  focus 
on  different  high-level  ideas  as  it  works  on  un¬ 
derstanding  an  analogy  problem.  Furthermore, 
once  an  answer  has  been  understood,  its  asso¬ 
ciated  themes  represent  a  characterization  of  the 
key  ideas  underlying  the  answer,  which  can 
subsequently  be  used  as  the  basis  for  compar¬ 
ing  and  contrasting  the  answer  with  other  an¬ 
swers  encountered  previously. 
ABSTRACT

This  paper  contrasts  two  views  about  the 
relationship  between  the  processes  of  access 
and  mapping  in  analogy-making.  According  to 
the  modular  view,  analog  access  and  mapping 
are two separate ‘phases’ that run sequentially
and  relatively  independently.  The  interaction- 
ist  view  assumes  that  they  are  interdependent 
subprocesses  that  run  in  parallel.  The  paper  ar¬ 
gues  in  favor  of  the  second  view  and  presents  a 
simulation  experiment  demonstrating  its  advan¬ 
tages.  The  experiment  is  performed  with  the 
computational  model  Ambr  and  illustrates  one 
particular  way  in  which  the  subprocess  of  map¬ 
ping  can  influence  the  subprocess  of  access. 

INTRODUCTION 

A  crucial  point  in  analogy-making  is  the  re¬ 
trieval  of  a  base  (or  source)  analog.  Accessing 
an  appropriate  base  from  the  vast  pool  of  epi¬ 
sodes  stored  in  the  long-term  memory  is  not  only 
a  logical  necessity  (one  cannot  make  analogies 
without a source) but apparently is the most difficult
and capricious element of analogy-making.
Starting with the classical experiments of
Gick  and  Holyoak  (1980)  it  has  been  repeatedly 
demonstrated  that  people  have  difficulties  in 
spontaneously accessing a base analog, especially
when its domain is very different from that of
the target problem. In the aforementioned study
only  about  20%  of  the  subjects  were  able  to  solve 
the  so-called  radiation  problem  even  though  an 
analogous problem (with solution) was presented
shortly before the test phase. When provided
with an explicit hint to use this source analog, however,
75% of the subjects achieved the solution.
This  great  difference  between  the  two  experi¬ 
mental  conditions  was  attributed  to  the  difficul¬ 
ty  of  analog  access. 

On  the  other  hand,  we  know  a  lot  of  stories 
about  great  scientists  making  discoveries  by 
spontaneously using remote analogies. We also
have personal experience in everyday usage of
remote  analogies.  A  recent  study  by  Wharton, 
Holyoak,  and  Lange  (1996)  has  demonstrated 
that  about  35%  of  their  subjects  were  success¬ 
fully  reminded  about  a  remote  analog  story 
studied  7  days  earlier  when  cued  by  the  target 
story.  (They  have  used  a  directed  reminding 
task,  not  a  problem  solving  task,  however.) 

Researchers  of  analogical  access  have  be¬ 
come  interested  in  the  features  of  a  remote  ana¬ 
log  that  facilitate  retrieval.  Most  data  in  the  field 
(Holyoak and Koh, 1987; Ross, 1989) suggest
that  analogical  access  is  almost  exclusively  guid¬ 
ed  by  superficial  semantic  similarities  between 
base and target: similar objects and relations,
similar  themes,  similar  story  lines,  etc.  In  con¬ 
trast,  analogical  mapping  is  dominated  by  the 
structural  similarity  between  target  and  base,  i.e. 
having common systems of relations (Gentner,
1983, 1989). This explains why remote analogs
are  much  more  difficult  to  access  than  to  map — 
they  lack  the  superficial  similarities  needed  for 
access  but  do  have  the  (quasi)isomorphic  rela¬ 
tional  structure  necessary  for  mapping. 

This  clear  separation  stimulated  the  re¬ 
searchers  in  the  field  to  build  separate  models  of 
mapping  and  retrieval  and  even  to  claim  that  they 
are different cognitive modules. Thus Gentner
(1989)  claims  that  ‘the  analogy  processor  (the 
mapping  machine)  is  a  well-defined  separate 
cognitive  module  whose  results  interact  with 
other  processes,  analogous  to  the  way  some  nat¬ 
ural  language  models  have  postulated  semi-au¬ 
tonomous interacting subsystems for syntax, semantics,
and pragmatics.’ Although she explicitly
mentions in a footnote that this should not
be  considered  in  the  Fodorian  sense  as  innate 
and  impenetrable,  the  actual  models  built  are 
quite  impenetrable.  This  line  of  research  has 
generated  a  number  of  quite  successful  models 
that  explained  the  data  and  made  some  new  pre¬ 
dictions.  Typically,  a  model  of  mapping  is  cou¬ 
pled  with  a  (separate)  model  of  retrieval.  The 
best-known  examples  are  SME  +  MAC/FAC 
(Falkenhainer, Forbus, and Gentner, 1986; Forbus,
Gentner, and Law, 1995) and ACME +
ARCS  (Holyoak  and  Thagard,  1989;  Thagard, 
Holyoak, Nelson, and Gochfeld, 1990).

However,  the  experimental  work  soon  re¬ 
vealed  that  the  pattern  is  not  that  clear  and 
straightforward.  It  has  been  demonstrated  that 
superficial  similarities  do  play  an  important  role 
in  mapping  as  well.  In  particular  cross-mapping 
is  difficult  (Ross,  1989).  This  led  Holyoak  and 
Thagard  to  include  syntactic,  semantic,  and 
pragmatic  constraints  in  their  model  of  map¬ 
ping  ACME  (Holyoak  &  Thagard,  1989)  and 
to develop their multi-constraint theory (Holyoak
& Thagard, 1995).

There  are  also  some  indications  that  struc¬ 
tural  similarity  might  play  a  role  in  access  as 
well. Thus Ross (1989) demonstrated that in
some cases (when the general story line is sim¬
ilar)  structural  similarity  plays  a  positive  role 
in  retrieval,  while  in  other  cases  (when  the  gen¬ 
eral  story  line  is  dissimilar)  it  does  not  play  any 
role  or  can  even  worsen  the  results.  The  results 
of  Wharton,  Holyoak,  and  Lange  (1996)  also 
support  indirectly  the  hypothesis  that  structur¬ 
al  correspondences  might  affect  the  access.  This 
was  reflected  in  the  models  being  proposed. 
Both  MAC/FAC  and  ARCS  included  a  sub- 
module  of  partial  mapping  in  the  module  of 
retrieval,  thus  considering  structural  similari¬ 
ties  at  an  early  stage. 

To  sum  up,  the  initial  separation  between 
retrieval  and  mapping  was  founded  on  their 
different  psychological  characteristics — seman¬ 
tic  factors  govern  the  retrieval,  structural  fac¬ 
tors  govern  the  mapping.  Subsequent  more  pre¬ 
cise  experiments,  however,  cast  doubt  on  this 
clear  separation.  These  complications  were 
accommodated  by  making  patches  to  the  orig¬ 
inal  models.  Finally,  it  was  acknowledged  that 
all  kinds  of  constraints  affected  all  phases  of 
analogy-making,  although  to  different  extent 
(Holyoak & Thagard, 1995).

The  experimental  data  themselves  became 
more  and  more  complex  and  controversial. 
These  controversies  can  be  explained  in  terms 
of  more  and  more  sophisticated  classifications 
of  the  types  of  similarities  involved  in  access 
and  mapping  (Ross,  1989;  Ross  &  Kilbane, 
1997).  We  argue,  however,  that  these  problems 
are  resolved  more  parsimoniously  by  adopting 
a  principally  different  view  of  analogy-making. 

This  resembles  an  episode  of  the  history  of 
astronomy.  The  geocentric  system  of  Ptolemy 
started  as  a  straightforward  theory  that  de¬ 
scribed  the  observable  movement  of  both  stars 
and  planets  remarkably  well.  As  accuracy  of 
measurement  increased,  however,  discrepancies 
between  theory  and  data  crept  in  every  now  and 
then.  It  became  routine  for  astronomers  to  deal 
with  such  ‘anomalies’  by  adding  more  and  more 
epicycles.  But  as  time  went  on,  it  became  evi¬ 
dent  that  astronomy’s  complexity  was  increas¬ 
ing  far  more  rapidly  than  its  accuracy  and  that 
a  discrepancy  corrected  in  one  place  was  likely 
to  show  up  in  another  (Kuhn,  1970). 
Back to the domain of analogy-making,
most classical models assume sequential processing:
first the retrieval process finds the base
for  analogy  and  then  the  mapping  process  builds 
the  correspondences  between  the  target  and  the 
retrieved base (Figure 1). Thus MAC/FAC+SME
and ARCS+ACME are linear mod¬
els  separating  retrieval  and  mapping  in  time  and 
space.  This  view  underlies  most  of  the  experi¬ 
mental  work  in  the  field  as  well.  Researchers 
often  contrast  hint  versus  non-hint  conditions 
in  problem  solving  supposing  that  in  the  first 
case  only  mapping  takes  place,  while  in  the  sec¬ 
ond  retrieval  and  mapping  are  running  one  af¬ 
ter  the  other.  However,  as  Ross  (1989)  has  not¬ 
ed,  even  when  explicitly  hinted  to  use  a  certain 
analog  subjects  still  must  access  the  details  of 
its  representation.  Another  common  experimen¬ 
tal  technique  uses  a  memory  task  (typically  re¬ 
call)  for  studying  access  with  the  assumption 
that  the  same  processes  take  place  during  ana¬ 
logical  problem  solving. 

The  limitations  of  both  the  models  and  ex¬ 
perimental  methods  can  be  overcome  by  giv¬ 
ing  up  the  linearity  assumption.  This  might 
look strange at first glance: how can you map
the target onto the base if you have
not even accessed it?! If, however, one reconsiders
one more assumption, namely that there are
centralized  representations  of  situations/prob¬ 
lems  in  human  memory — then  it  becomes 
clear  that  whenever  we  have  partial  retrieval 
of  the  base  (having  recalled  a  few  details)  we 
can  start  looking  for  corresponding  elements 
in  the  target.  This  allows  us  to  conceptualize 
access  and  mapping  as  parallel  processes  that 
can  interact  (Figure  2).  In  this  paradigm,  ac¬ 
cess and mapping refer not to phases or other
behavioral steps, but rather to separate mech¬
anisms  that  both  play  a  role  in  selecting  and 
activating  a  base  and  in  finding  the  correspon¬ 
dences  between  base  and  target. 
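
The contrast between the two views can be illustrated with a deliberately simple toy. It is not the Ambr model: the episode names, the similarity scores, and the linear feedback rule are all invented for illustration. It shows only how feedback from a concurrent mapping process could change which base wins the access competition.

```python
def sequential_access(surface):
    """Modular view: access finishes first, driven by surface similarity alone."""
    return max(surface, key=surface.get)

def interactive_access(surface, structure, cycles=10, feedback=0.2):
    """Interactionist view: on every cycle the partial mapping found so far
    feeds activation back to each candidate base (assumed linear rule)."""
    activation = dict(surface)
    for _ in range(cycles):
        for episode in activation:
            activation[episode] += feedback * structure[episode]
    return max(activation, key=activation.get)

# Hypothetical candidate bases with made-up similarity scores in 0..1.
surface = {"near-miss story": 0.6, "remote analog": 0.5}
structure = {"near-miss story": 0.1, "remote analog": 0.9}

print(sequential_access(surface))
print(interactive_access(surface, structure))
```

In this toy the two procedures disagree only because the structurally better base is slightly weaker on the surface. When the superficially closest base is also the structurally best one, both procedures select the same episode.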

The  current  paper  explores  the  implica¬ 
tions  of  the  parallel  and  interactive  view  on 
access  and  mapping  by  running  simulation  ex¬ 
periments  with  an  integrated  model  of  human 
(analogical) reasoning called Ambr (Kokinov,
1988,  1994c,  Petrov,  1997).  These  experi¬ 
ments  provide  a  detailed  example  of  how  these 
two  processes  can  interact  and  thus  open  space 
for  new  theoretical  speculations  as  well  as  for 
new experimental paradigms. Ambr’s predictions
about the development of the process
over  time  call  for  appropriate  experimental 
methods  capturing  the  dynamics  of  human 
analogy-making — RT  studies,  think-aloud 
protocols,  etc.  Some  of  the  controversies 
around  the  role  of  superficial  and  structural 
similarities in access and mapping ‘phases’
can  now  be  expressed  in  terms  of  the  interac¬ 
tions  between  the  two  mechanisms. 

A  very  important  contribution  of  the  sim¬ 
ulation  is  that  it  demonstrates  how  the  sup¬ 
posedly later ‘phase’ of mapping can influence
the supposedly earlier ‘phase’ of access.
A  detailed  example  shows  how  the  access 
process  develops  over  time  and  how  it  is  in¬ 
fluenced  by  the  concurrent  mapping  process. 
This  is  contrasted  with  the  case  of  isolated 
access.  Different  results  are  obtained  in  the 
two  cases.  These  results  correspond  to  the 
data of Ross and Sofka (unpublished), whose
main conclusions are summarized in (Ross,
1989) as follows: ‘... other work (Ross &
Sofka,  1986)  suggests  the  possibility  that  the 
retrieval may be greatly affected by the use.
In particular, we found that subjects, whose
task was to recall the details of an earlier example
that the current test problem reminded
them of, used the test problem not only as
an initial reminder but throughout the recall.
For instance, the test problem was used to
probe for similar objects and relations and
to prompt recall of particular numbers from
the earlier example. The retrieval of the earlier
example appeared to be interleaved with
its use because subjects were setting up correspondences
between the earlier example
and the test problem during the retrieval.’ The
simulation data presented in the current paper
(obtained absolutely independently and
based only on the theoretical assumptions of
Dual and Ambr) exhibit exactly the same pattern
of interaction.

Figure 1. Dominating sequential models of analogy-making. [Diagram labels: Access, Mapping, time.]

Figure 2. Parallel and interactive models of analogy-making. [Diagram labels: Access, Mapping, time.]

We must admit that even in a highly parallel
and interactive model such as Ambr the effects of
interaction do not predominate. In the majority
of cases the independent work of the access
mechanism might well yield the same results as
the interaction between mapping and access
described above. That is why the classical linear
models of analogy have been successful and have
contributed a lot to our understanding of human
analogy-making. However, it is exactly the few
exceptional cases that do provide different results
in a parallel model that are the more interesting,
and they are the ones that make the interpretation
of the experimental data look controversial if
analyzed in the spirit of the sequential models.

There  are  a  few  other  models  that  advo¬ 
cate  a  parallel,  overlapping,  and  interactive 
view  on  analogy — Copycat  (Mitchell,  1993, 
Hofstadter,  1995),  Tabletop  (French,  1995, 
Hofstadter,  1995),  and  LISA  (Hummel  and 
Holyoak,  1997).  However,  Copycat  and  Ta¬ 
bletop  do  not  model  retrieval  at  all — they 
model the parallel work and interaction between
perception/representation building and
mapping. LISA also integrates access and
mapping and performs them in parallel. Thus
the mapping mechanism (connectionist learning
in this case) influences the access. As a
result, LISA could in principle demonstrate
effects similar to those reported here.


BRIEF DESCRIPTION OF THE ARCHITECTURE DUAL AND THE MODEL AMBR

The  basis  for  the  simulation  experiment 
discussed  in  this  paper  is  a  model  called  Ambr 
(Associative  Memory-Based  Reasoning).  It  is 
built  on  the  cognitive  architecture  Dual.  Space 
limitations  allow  only  an  extremely  sketchy 
description  of  Dual  and  Ambr  here.  The  inter¬ 
ested  reader  is  referred  to  earlier  publications 
(Kokinov, 1988, 1994a,b,c; Petrov, 1997).

Dual  is  a  multi-agent  cognitive  architec¬ 
ture  that  supports  dynamic  emergent  computa¬ 
tion  (Kokinov, Nikolov,  and  Petrov,  1996).  All 
knowledge  representation  and  information  pro¬ 
cessing  in  the  architecture  is  carried  out  by  small 
entities called Dual agents. Each Dual-based
system  consists  of  a  large  number  of  them. 
There  is  no  central  executive  in  the  architec¬ 
ture  that  controls  its  global  operation.  Instead, 
each  individual  agent  is  relatively  simple  and 
has  access  only  to  local  information,  interact¬ 
ing  with  a  few  neighboring  agents.  The  overall 
behavior  of  the  system  emerges  out  of  the  col¬ 
lective  activity  of  the  whole  population.  This 
‘society  of  mind’  (Minsky,  1986)  provides  a 
substrate  for  concurrent  processing,  interaction, 
and  emergent  computation. 

Each  Dual  agent  is  a  hybrid  entity  that  has 
symbolic  and  connectionist  aspects  (Kokinov 
1994a,b,c).  On  the  symbolic  side,  each  agent 
‘stands  for’  something  and  is  able  to  perform 
certain  simple  manipulations  on  symbols.  On 
the  connectionist  side,  it  sends/receives  activa¬ 
tion  to  and  from  its  immediate  neighbors.  Thus, 
we  may  adopt  an  alternative  terminology  and 
speak  of  nodes  and  links  instead  of  agents  and 
interactions.  The  population  of  agents  may  be 
conceptualized  as  a  network  of  nodes. 

The long-term memory of a Dual-based
system  consists  of  the  network  of  all  agents  in 
that  system.  The  size  of  this  network  can  be 
very  large.  Only  a  small  fraction  of  it,  howev¬ 
er,  may  be  active  at  any  particular  moment.  The 
active  subset  of  the  long-term  memory  togeth¬ 
er  with  some  temporary  agents  constitutes  the 
working memory (WM) of the architecture. The
mechanism of spreading activation plays a key
role  for  controlling  the  size  and  the  contents  of 
the  WM.  There  is  a  threshold  that  sets  the  min¬ 
imal  level  of  activation  that  must  be  obtained 
by  an  agent  to  enter  the  WM.  There  is  also  a 
spontaneous  decay  factor  that  pushes  the  acti¬ 
vation  levels  back  to  zero.  As  the  pattern  of 
activation  changes  over  time,  some  agents  from 
the  working  memory  fall  back  to  dormancy, 
others  are  activated,  etc.  Only  active  agents  may 
perform  symbolic  computation.  Moreover,  the 
speed  of  this  computation  depends  on  the  level 
of  activation  of  the  respective  agent.  This  makes 
the computation in Dual dynamic and context-
sensitive (Kokinov et al., 1996; Kokinov,
1994a,b,c).  One  particular  consequence  of  this 
dynamic  emergent  nature  of  the  architecture  is 
that,  although  all  micro-level  processing  is 
strictly  deterministic,  the  macroscopic  behav¬ 
ior  of  a  Dual  system  can  be  described  only 
probabilistically. 
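
The working-memory dynamics just described can be illustrated with a small sketch. It is not the Dual implementation: the update rule, the parameter names (`decay`, `rate`, `threshold`), their values, and the toy network are all assumptions chosen for illustration. The sketch only shows how a threshold combined with spontaneous decay keeps the active subset of a network small and lets it change over time.

```python
def spread_step(activation, links, sources, decay=0.1, rate=0.5, threshold=0.1):
    """Perform one synchronous update of a toy spreading-activation network.

    activation: dict mapping node name -> activation level in [0, 1]
    links:      dict mapping node name -> list of (sender, weight) incoming links
    sources:    set of nodes clamped to full activation (e.g. the target problem)
    All parameter values are hypothetical, not taken from Dual.
    """
    updated = {}
    for node, level in activation.items():
        if node in sources:
            updated[node] = 1.0
            continue
        # Only senders that are in working memory (at or above threshold) emit activation.
        net = sum(weight * activation[sender]
                  for sender, weight in links.get(node, [])
                  if activation[sender] >= threshold)
        # Spontaneous decay pulls the level toward zero; input pushes it toward one.
        level = level - decay * level + rate * net * (1.0 - level)
        updated[node] = min(1.0, max(0.0, level))
    return updated


def working_memory(activation, threshold=0.1):
    """Return the set of nodes whose activation reaches the WM threshold."""
    return {node for node, level in activation.items() if level >= threshold}


if __name__ == "__main__":
    # Hypothetical chain: target -> a -> b, plus an unconnected node c.
    links = {"a": [("target", 0.8)], "b": [("a", 0.8)]}
    activation = {node: 0.0 for node in ("target", "a", "b", "c")}
    for _ in range(50):
        activation = spread_step(activation, links, sources={"target"})
    print(sorted(working_memory(activation)))
```

In this toy network the unconnected node never enters the working memory, and once the activation source is removed all nodes gradually fall back to dormancy.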

The  Ambr  model  takes  advantage  of  these 
architectural  features  to  account  for  some  phe¬ 
nomena  of  human  reasoning  and  in  particular 
reasoning  by  analogy  (Kokinov,  1988, 1994c). 
Again, due to space limitations we will consider
only a small fraction of the model’s mechanisms.

Analog access in Ambr is done by means
of spreading activation, carried out by the
connectionist aspects of the Dual agents. In
particular, only a few of the many episodes stored
in the long-term memory are active during a run
and only they are accessible for processing. The
episodes or ‘situations’ have decentralized
representations: it is not a single agent but a whole
coalition that represents the elements of a
situation and the relationships among them.
Therefore, it is possible that an episode is only
partially accessed because only some of its agents
have entered the WM.

The  process  of  analogical  mapping  is  done 
in  Ambr  by  a  combination  of  three  mecha¬ 
nisms — marker  passing,  constraint  satisfaction, 
and  structure  correspondence  (Kokinov,  1994c; 
Petrov, 1997). The main idea is to build a constraint
satisfaction network (CSN) to determine
the mapping between two situations. This network
consists of hypothesis agents representing
tentative correspondences between two elements.
Consistent hypotheses support each other, and
incompatible ones inhibit each other.

This  is  similar  to  other  models  of  analogy¬ 
making  and  notably  ACME  (Holyoak  and 
Thagard,  1989).  Ambr  differs  from  the  latter 
model, however, in several ways: (i) the CSN
is constructed dynamically, (ii) only hypotheses
that have some justification are created,
(iii) the CSN is incorporated into the bigger
working  memory  network,  and  (iv)  there  is  no 
separate  relaxation  phase  so  there  is  a  partial 
mapping  at  each  moment. 
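
The support and inhibition among hypothesis agents can be sketched as follows. This is a deliberately simplified illustration, not Ambr's mechanism: the network here is built in one pass and then relaxed in a loop, whereas Ambr constructs its CSN dynamically and has no separate relaxation phase. The link weights, the update rule, and the hypothesis names are assumptions.

```python
from itertools import combinations


def build_csn(hypotheses, supporting_pairs, excite=0.5, inhibit=-0.5):
    """Build symmetric links between hypothesis nodes.

    hypotheses:       list of (source_element, target_element) tuples
    supporting_pairs: set of frozensets, each holding two mutually consistent hypotheses
    Hypotheses that share a source or a target element compete, because together
    they would break a one-to-one mapping. The weights are hypothetical.
    """
    weights = {h: {} for h in hypotheses}
    for h1, h2 in combinations(hypotheses, 2):
        if frozenset((h1, h2)) in supporting_pairs:
            weights[h1][h2] = weights[h2][h1] = excite
        elif h1[0] == h2[0] or h1[1] == h2[1]:
            weights[h1][h2] = weights[h2][h1] = inhibit
    return weights


def relax(weights, steps=200, rate=0.2, initial=0.1):
    """Iterate a simple bounded update until the step budget is used up."""
    act = {h: initial for h in weights}
    for _ in range(steps):
        new_act = {}
        for h, neighbours in weights.items():
            net = sum(w * act[other] for other, w in neighbours.items())
            # Positive input moves the node toward 1, negative input toward 0.
            room = (1.0 - act[h]) if net > 0 else act[h]
            new_act[h] = min(1.0, max(0.0, act[h] + rate * net * room))
        act = new_act
    return act


# Hypothetical hypotheses loosely based on the kitchen domain.
WATER_COKE = ("water", "coke")
HEATER_ICE = ("imm.heater", "ice.cube")
IN_IN = ("in(imm.heater, water)", "in(ice.cube, coke)")
WATER_ICE = ("water", "ice.cube")  # rival that breaks the one-to-one constraint

supports = {frozenset((IN_IN, WATER_COKE)), frozenset((IN_IN, HEATER_ICE))}
csn = build_csn([WATER_COKE, HEATER_ICE, IN_IN, WATER_ICE], supports)
final = relax(csn)
print({h: round(a, 3) for h, a in final.items()})
```

The relational hypothesis supports the two object hypotheses that are consistent with it, and together they suppress the rival that would map water onto the ice cube.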

The  implication  of  these  four  points  is  that, 
unlike  ACME  and  most  other  analogy  models, 
the  processes  of  access  and  mapping  run  in  par¬ 
allel  and  influence  each  other  in  Ambr.  In  oth¬ 
er  words,  the  model  departs  from  the  classical 
‘pipeline’ paradigm and aims at a more interactive
account of analogy-making.

The  influence  between  the  two  sub¬ 
processes  in  Ambr  goes  in  both  directions.  The 
present paper concentrates on the ‘backward’
direction — from  mapping  to  access.  The  next 
section  describes  a  simulation  experiment  that 
sheds  light  on  this  kind  of  influence. 

SIMULATION EXPERIMENT

Method

We  performed  a  simulation  experiment  to 
contrast  the  two  ways  of  combining  access  and 
mapping — parallel  vs.  serial.  The  experiment 
also tested whether the Ambr model was able
to access a source analog out of a pool of
episodes,  and  to  map  it  onto  a  target  situation. 

Design 

The  experiment  consisted  of  two  conditions. 
Both  conditions  involved  running  the  model  on 
a target problem. In the ‘parallel condition’,
Ambr operated in its normal manner with the
mechanisms for access and mapping working in
parallel. In the ‘serial condition’, the program
was artificially forced to work serially: first to
access and only then to map. The target problem
and  the  content  of  the  long-term  memory  were 
identical  in  all  runs.  The  topics  of  interest  fell 
into two categories: the final mapping constructed
by the program and the dynamics of the
underlying  computation.  The  latter  was  moni¬ 
tored  by  recording  a  set  of  variables  describing 
the  internal  state  of  the  system  at  regular  time 
intervals  throughout  each  run. 

Materials 

The  domain  used  in  the  experiment  deals 
with  simple  tasks  in  a  kitchen.  The  long-term 
memory  of  the  model  contains  semantic  and 
episodic  knowledge  about  this  domain.  It  has 
been  coded  by  hand  according  to  the  represen¬ 
tation  scheme  used  in  Dual  and  Ambr  (Koki- 
nov,  1994c;  Petrov,  1997).  The  total  size  of  the 
knowledge base is about 500 agents (300 ‘semantic’
+ 200 ‘episodic’). It states, for example,
that water, milk, and tea are all liquids, that
bottles  are  made  of  glass,  and  the  relation  ‘on’ 
is  a  special  case  of  ‘in-touch-with’.  The  LTM 
also stores the representations of eight situations
related to heating and cooling liquids. Two
of  these  eight  situations  are  most  important  for 
the  experiment  and  are  described  below  togeth¬ 
er  with  the  target  problem. 

As  evident  from  Figures  3,  4,  and  5,  both 
situations  A  and  B  may  be  considered  similar 
to  the  target  problem.  There  are  some  differ¬ 
ences,  however.  Situation  B  involves  the  same 
objects and relations as the target but the structures
of the two are different. In contrast, situation
A involves different objects but its system
of relations is completely isomorphic to
that of the target. According to Gentner (1989),
the pair A-T may be classified as an analogy
while B-T is a case of mere appearance. Thus it was
expected  that  situation  B  would  be  easier  to 
retrieve  from  the  total  pool  of  episodes  stored 
in  LTM.  On  the  other  hand,  A  would  be  more 
problematic  to  retrieve  but  once  accessed  it 
would  support  better  mapping. 


Situation  A:  There  is  a  cup  and  some  wa¬ 
ter  in  it.  The  cup  is  on  a  saucer  and  is  made  of 
china.  There  is  an  immersion  heater  in  the  wa¬ 
ter.  The  immersion  heater  is  hot.  The  goal  is 
that  the  water  is  hot. 

The  outcome  is  that  the  water  is  hot.  This 
is caused by the hot immersion heater in it.


Figure  3.  Schematized  representation  of  situation  A. 
Objects  are  shown  as  boxes  and  relations  as  arrows. 
Dashed arrows stand for relations in the ‘outcome’. The
actual Ambr representation is more complex: it
consists of 19 agents and explicates the causal structure
(not  shown  in  the  figure).  See  text  for  details. 


Situation B: There is a glass and an ice
cube on it. The glass is made of [material] glass.
The glass is in a fridge. The fridge is cold. The
goal is that the ice cube is cold.

The  outcome  is  that  the  ice  cube  is  cold. 
The  fact  that  it  is  on  the  glass  and  the  glass  is 
in  the  fridge  entails  that  the  ice  cube  itself  is  in 
the  fridge.  In  turn,  this  causes  the  ice  cube  to 
be  cold,  as  the  fridge  is  cold. 


Figure  4.  Schematized  representation  of  situation  B. 
Dashed arrows stand for relations in the ‘outcome’. The
actual Ambr representation is more complex: it
consists  of  21  agents  and  explicates  the  causal  structure 
(not  shown  in  the  figure).  See  text  for  details. 
Target problem (situation T): There is a
glass and some coke in it. The glass is on a
table and is made of [material] glass. There is
an  ice  cube  in  the  coke.  The  ice  cube  is  cold. 
The  goal,  if  any,  is  not  represented  explicitly. 
What  is  the  outcome  of  this  state  of  affairs? 


Figure  5.  Schematized  representation  of  the  target 
situation. The actual Ambr representation is more
complex and consists of 15 agents. See text for details.


Procedure 

The  Common  Lisp  implementation  of  the 
Ambr  model  was  run  two  times  on  the  target 
problem. The two runs carried out the ‘parallel’
and the ‘serial’ conditions of the experiment,
respectively. The contents of the long-term
memory and the parameters of the model
were  identical  in  the  two  conditions. 

Recall  that  situations  have  decentralized  rep¬ 
resentations  in  Ambr.  The  target  problem  was 
represented  by  a  coalition  of  15  agents  standing 
for  the  ice-cube,  the  glass,  two  instances  of  the 
relation ‘in’ and so on. Twelve of these agents were
attached  to  the  special  nodes  that  serve  as  acti¬ 
vation  sources  in  the  model.  The  attachment  was 
the  same  in  the  two  experimental  conditions. 

In  the  parallel  condition,  the  model  was  al¬ 
lowed  to  run  according  to  its  specification.  That 
is,  all  Ambr  mechanisms  ran  in  parallel,  inter¬ 
acting  with  one  another.  The  program  iterated 
until the system reached a resting state. A number
of variables were recorded at regular intervals
throughout the run. Out of these many variables,
the so-called retrieval index is of special
interest.  It  is  computed  as  the  average  activation 
level  of  the  agents  involved  in  each  situation. 
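
A minimal sketch of the retrieval index as defined above: the mean activation of the agents that make up a situation. Treating agents outside the working memory as having zero activation is an assumption of this sketch, as are the agent names and activation values in the example.

```python
def retrieval_index(activation, situation_agents):
    """Average activation of the agents belonging to one situation.

    activation:       dict mapping agent name -> current activation level
    situation_agents: non-empty list of the agents in the situation's coalition
    Agents missing from `activation` are treated as dormant (level 0.0),
    which is an assumption of this sketch.
    """
    if not situation_agents:
        raise ValueError("a situation must contain at least one agent")
    total = sum(activation.get(agent, 0.0) for agent in situation_agents)
    return total / len(situation_agents)


def log_indices(activation, situations):
    """Return one log record: situation name -> retrieval index."""
    return {name: retrieval_index(activation, agents)
            for name, agents in situations.items()}


# Hypothetical snapshot: two of the three agents of situation A are active,
# none of the agents of situation B are.
snapshot = {"cup-A": 0.5, "water-A": 0.25}
situations = {"A": ["cup-A", "water-A", "saucer-A"],
              "B": ["glass-B", "fridge-B"]}
record = log_indices(snapshot, situations)
print(record)
```

Calling `log_indices` at regular intervals during a run would yield the kind of time series discussed in the results, one value per situation per time step.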

In  short,  at  the  end  of  the  run  we  had  the 
final  mapping  constructed  by  the  program  as 
well  as  a  log  file  of  the  retrieval  indices  of  all 
eight  situations  from  the  LTM. 

In  the  serial  condition,  the  target  problem  was 
attached  to  the  activation  source  in  the  same  way 
and  the  same  data  were  collected.  However,  the 
operation of the program was forcibly modified
to separate the processes of access and mapping.
To that end, the run was divided into two steps.

During  step  one,  all  mapping  mechanisms 
in  Ambr  were  manually  switched  off.  Thus, 
spreading  activation  was  the  only  mechanism 
that  remained  operational.  It  was  allowed  to 
work  until  the  pattern  of  activation  reached  as¬ 
ymptote.  The  situation  with  the  highest  retrieval 
index  was  then  identified.  If  we  hypothesize  a 
‘retrieval module’, this is the situation that it
would  access  from  LTM. 

After  the  source  analog  was  picked  up  in 
this  way,  the  experiment  proceeded  with  step 
two.  The  mapping  mechanism  was  switched 
back on again but it was allowed to work only
on the source situation retrieved at step one.
This situation was mapped to the target. Thus,
at the end of the second run we had the final
mapping constructed at step two, as well as two
logs of the retrieval indices.

RESULTS  AND  DISCUSSION 

In both experimental conditions the model
settled in less than 150 time units and produced
consistent mappings. By ‘consistent’ we mean
that each element of the target problem was
unambiguously mapped to an element from LTM
and that all these corresponding elements
belonged to one and the same base situation. Stated
differently, the mappings were one-to-one and
there were no blends between situations.
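
The notion of a ‘consistent’ mapping used here can be written down as a simple check. The data layout (a list of correspondence pairs and a dictionary from source elements to situation names) is an assumption of this sketch; the text does not describe how the check was actually carried out.

```python
def is_consistent(correspondences, situation_of):
    """Check that a mapping is one-to-one and draws on a single base situation.

    correspondences: list of (source_element, target_element) pairs
    situation_of:    dict mapping each source element to its situation name
    Both structures are hypothetical conveniences, not Ambr data structures.
    """
    sources = [source for source, _ in correspondences]
    targets = [target for _, target in correspondences]
    # One-to-one: no source element and no target element is used twice.
    one_to_one = (len(set(sources)) == len(sources)
                  and len(set(targets)) == len(targets))
    # No blends: all mapped source elements come from the same situation.
    no_blend = len({situation_of[source] for source in sources}) <= 1
    return one_to_one and no_blend


situation_of = {"water": "A", "imm.heater": "A", "fridge": "B"}
good = [("water", "coke"), ("imm.heater", "ice.cube")]
blend = [("water", "coke"), ("fridge", "ice.cube")]
ambiguous = [("water", "coke"), ("imm.heater", "coke")]
print(is_consistent(good, situation_of),
      is_consistent(blend, situation_of),
      is_consistent(ambiguous, situation_of))
```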

In the parallel condition, the target problem
was mapped to situation A, yielding the
correspondences in-in, water-coke,
imm.heater-ice.cube, T.of-T.of, high.T-low.T,
made.of-made.of, etc. Four elements from the
source  situation  remained  unmapped  and  in 
particular  the  agent  representing  that  the  water 
is  hot.  This  proposition  is  a  good  candidate  for 
inference by analogy. Mutatis mutandis, it could
bring  the  conclusion  that  the  coke  is  cold.  (In 
the  current  version  of  Ambr2  the  mechanisms 
for  analogical  transfer  are  not  implemented  yet.) 

In  the  serial  condition,  situation  B  won  the 
retrieval  stage.  This  is  explained  by  the  high 
semantic  similarity  between  its  elements  and 
those of the target: both deal with ice cubes in
glasses,  cold  temperatures,  etc.  The  asymptot¬ 
ic  level  of  the  retrieval  index  for  B  was  about 
four  times  greater  than  that  of  any  other  situa¬ 
tion.  In  particular,  situation  A  ended  up  with 
only  5  out  of  19  agents  passing  the  working 
memory  threshold. 

According to the experimental procedure,
situation B was then mapped to the target during
the second stage of the run. The correspondences
that emerged during the latter stage are
shown in Table 1. The semantic similarity constraint
has dominated this run. This is not surprising
given the high degree of superficial similarity
between the two situations. There is,
however, a serious flaw in the set of correspondences.
The proposition ‘T.of (ice.cube, low-T)’,
which belongs to the initial state of the target,
is mapped to the proposition ‘T.of
(ice.cube, low-T)’, which is a consequence in
the source. Therefore, the whole analogy between
the target problem and situation B
could hardly generate any useful inference.

Situation B                      Target situation
-------------------------------  ----------------------------
ice.cube                         ice.cube
fridge                           coke
glass                            glass
in (ice.cube, fridge)            in (ice.cube, coke)
in (glass, fridge)               in (coke, glass)
on (ice.cube, glass)             on (glass, saucer)
T.of (fridge, low-T)             <unmapped>
T.of (ice.cube, low-T)           T.of (ice.cube, low-T)
low-T                            low-T
made.of (glass, m.glass)         made.of (glass, m.glass)
m.glass                          m.glass
initstate1                       initstate
initstate2                       <unmapped>
interstate                       table
endstate                         endstate
goalstate                        <unmapped>
follows (initstate1, endst.)     follows (initstate, endst.)
to.reach (initstate1, goalst)    <unmapped>
cause (initstate2, in(i.c,fr))   <unmapped>
cause (interstate, T.of(i.c))    <unmapped>

Table 1. Correspondences constructed by the model in
the serial condition.

To  summarize,  when  the  mechanisms  for 
access  and  mapping  worked  together,  the  mod¬ 
el  constructed  an  analogy  that  can  potentially 
solve  the  problem.  On  the  other  hand,  when  the 
two  mechanisms  were  separated,  the  retrieval 
stage  favored  a  superficially  similar  but  in¬ 
appropriate  base. 

The  presentation  so  far  concentrated  on  the 
final  set  of  correspondences  produced  by  the 
model.  We  now  turn  to  the  dynamics  of  the 
computation  as  revealed  by  the  time  course  of 
the  retrieval  indices.  Figure  6  plots  the  retriev¬ 
al  indices  for  several  LTM  episodes  during  the 
first  run  of  the  program  (i.e.  when  access  and 
mapping  worked  in  parallel).  Figure  7  concen¬ 
trates  on  the  early  stage  of  the  first  run  and  com¬ 
pares  it  with  the  second  run  (i.e.  when  only  the 
access  mechanism  was  allowed  to  work).  Note 
that the two plots are on different scales.


Figure  6.  Plot  of  retrieval  indices  versus  time  for  the 
parallel  condition.  Situation  A  is  in  solid  line,  B  in 
dashed. The ‘south-west’ corner of the plot is reproduced
in Figure 7 with threefold magnification.
These plots tell the following story: At
the  beginning  of  the  parallel  run,  several  sit¬ 
uations  were  probed  tentatively  by  bringing 
a  few  elements  from  each  into  the  working 
memory.  Of  this  lot,  B  looked  more  promis¬ 
ing  than  any  of  its  rivals  as  it  had  so  many 
objects  and  relations  in  common  with  the  tar¬ 
get.  Therefore,  about  half  of  the  agents  be¬ 
longing  to  situation  B  entered  the  working 
memory  and  began  trying  to  establish  corre¬ 
spondences  between  themselves  and  the  tar¬ 
get  agents.  The  active  members  of  the  rival 
situations were doing the same thing, although
with lower intensity. At about 15 time
units  since  the  beginning  of  the  simulation, 
however,  situation  A  (with  the  immersion 
heater)  rapidly  gained  strength  and  eventual¬ 
ly  overtook  the  original  leader.  At  time  40,  it 
had  already  emerged  as  winner  and  gradual¬ 
ly  strengthened  its  dominance. 

The  final  victory  of  situation  A,  despite  its 
lower  semantic  similarity  compared  to  situa¬ 
tion  B,  is  due  to  the  interaction  between  the 
mechanisms  of  access  and  mapping  in  Ambr. 
More  precisely,  in  this  particular  case  it  is  the 
mapping  that  radically  changes  the  course  of 
access.  To  illustrate  the  importance  of  this  in¬ 
fluence,  Figure  7  contrasts  the  retrieval  indices 
with  and  without  mapping. 


Figure 7. Retrieval indices for situations A and B with
and without mapping influence on access. The thick
lines correspond to the parallel condition and replicate
(with threefold magnification) the lines from the ‘south-west’
corner of Fig. 6. The thin lines show ‘pure’
retrieval indices. See text for details.


The  thin  lines  in  Figure  7  show  the  re¬ 
trieval  indices  for  the  two  situations  when 
mapping mechanisms are suppressed. Thus,
they indicate the ‘pure’ retrieval index of
each situation: the value that is due to the
access mechanism alone. The index for situation
B is much higher than that of A and,
therefore, B was used as source when the
mapping  was  allowed  to  run  only  after  the 
access  had  finished. 

The  step-like  increases  of  the  plots  indi¬ 
cate  moments  in  which  an  agent  (or  usually  a 
tight  sub-coalition  of  two  or  three  agents)  pass¬ 
es  the  working  memory  threshold.  This  hap¬ 
pens,  for  example,  with  situation  B  between 
time  20  and  30  of  the  serial  condition  (the  thin 
dashed  line  in  Figure  7).  Thus,  accessing  a 
source  episode  in  Ambr  is  not  an  all-or-noth¬ 
ing  affair.  Instead,  situations  enter  the  work¬ 
ing  memory  agent  by  agent  and  this  process 
extends  far  after  the  beginning  of  the  mapping. 
In  this  way,  not  only  can  the  access  influence 
the  mapping  but  also  the  other  way  around. 

In  the  interactive  condition  the  mapping 
mechanism  boosted  the  retrieval  index  via  what 
we call a ‘bootstrap cascade’. This cascade operates
in Ambr in the following way. First, the
access mechanism brings two or three agents
of a given situation into the working memory.
If the mapping mechanism then detects that
these few agents can be plausibly mapped to
some target elements, it constructs new correspondence
nodes and links in the Ambr network.
This creates new paths for the highly active
target elements to activate their mates. The
latter  in  turn  can  then  activate  their  ‘coalition 
partners’,  thus  bringing  a  few  more  agents  into 
the  working  memory  and  so  on. 
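
The bootstrap cascade can be caricatured with sets instead of activation levels. The sketch below is an abstraction built on stated assumptions, not Ambr code: `mappable` stands for the agents for which the mapping mechanism can build a correspondence to some target element, `partners` stands for coalition links, and graded activation is replaced by simple membership in working memory.

```python
def bootstrap_cascade(initial_wm, mappable, partners, max_rounds=20):
    """Grow the working memory by alternating mapping and access steps.

    initial_wm: agents brought into WM by plain spreading activation
    mappable:   agents that can be plausibly mapped to a target element
    partners:   dict mapping an agent -> its coalition partners
    All three arguments are hypothetical simplifications.
    """
    wm = set(initial_wm)
    for _ in range(max_rounds):
        # Mapping step: active agents with a correspondence receive extra activation.
        boosted = wm & set(mappable)
        # Access step: boosted agents pull their coalition partners into WM.
        recruited = {p for agent in boosted for p in partners.get(agent, ())} - wm
        if not recruited:
            break
        wm |= recruited
    return wm


# Hypothetical coalition for a situation resembling situation A.
partners = {"heater-A": ["water-A"], "water-A": ["cup-A"], "cup-A": ["saucer-A"]}
with_mapping = bootstrap_cascade({"heater-A"},
                                 {"heater-A", "water-A", "cup-A"}, partners)
without_mapping = bootstrap_cascade({"heater-A"}, set(), partners)
print(sorted(with_mapping), sorted(without_mapping))
```

With mapping switched on, the single initially active agent recruits the rest of its coalition step by step; with mapping switched off, nothing beyond the initial agent enters working memory, which mirrors the contrast between the parallel and serial conditions in spirit only.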

The bootstrap cascade is possible in Ambr
due to two important characteristics of this
model. First, situations have decentralized
representations which may be accessed piece by
piece. Second, Ambr is based on a parallel cognitive
architecture which provides for concurrent
operation of numerous interacting processes.
Taken together, these two factors enable
seamless integration of the subprocesses of access
and mapping in analogy-making.
CONCLUSION

The  simulation  experiment  reported  in  this 
paper  provides  a  clear  example  of  mapping  in¬ 
fluence  on  analog  access  and  of  the  advantages 
of  the  parallel  interactionist  view  on  analogy- 
making.  Furthermore,  the  computational  mod¬ 
el  Ambr  provides  a  theoretical  framework  for 
explaining  the  controversies  in  the  psychologi¬ 
cal  data  on  access  and  reminding.  It  is  possible 
to explore in which cases the interaction between
access and mapping produces results different
from sequential and independent processing.
It also provides a framework for generating
more precise hypotheses and new experimental
designs for testing them. Thus, for example,
the detailed logs of the running model
might  be  used  for  comparison  with  protocols 
of  think-aloud  experiments. 

Analogy-making certainly has no clear-cut
boundaries.  Most  literature  has  concentrated  on 
explicit  analogies,  i.e.  consciously  retrieving  an 
analog  and  noticing  the  analogy.  However,  there 
are  other  cases  which  might  be  called  implicit 
or  partial  analogies,  e.g.  subconsciously  ac¬ 
cessing  part  of  a  previously  solved  problem  and 
mapping  it  to  part  of  the  target  description  with¬ 
out  consciously  noticing  the  analogy.  The  de¬ 
centralized  representations  of  situations  in 
Ambr  make  it  possible  to  model  the  process  of 
partial access, access with distortions, blending
(Turner & Fauconnier, 1995), and interference.
A previously solved problem can influence
the course of problem solving in an even
more  subtle  way  by  priming  some  concepts  or 
situations  which  then  trigger  a  particular  solu¬ 
tion  (Kokinov,  1990,  Schunn  and  Dunbar, 
1996).  The  Ambr  model  can  be  used  to  analyze 
such  cases.  It  has  already  been  successfully 
applied  for  predicting  priming  and  context  ef¬ 
fects  (Kokinov,  1994c). 

Priming  effects  are  an  example  of  the  in¬ 
fluence  of  access  on  mapping  which  is  the  op¬ 
posite  direction  of  the  one  discussed  in  the  cur¬ 
rent  paper.  Order  effects  are  another  kind  of 
effect that goes in the ‘forward’ direction. Such
effects may be due to non-simultaneous perception
of the elements of the target problem
(Keane, Ledgeway, & Duff, 1994) and/or non-simultaneous
retrieval of relevant pieces of information
from LTM. Thus the mutual influence
between analog access and mapping offers
many opportunities for investigation.
ABSTRACT

Recent research in metaphor and analogy,
as variously embodied in such systems as Sapper,
LISA, Copycat and TableTop, speaks to the
importance of three principles of cross-domain
mapping that have received limited attention
in what might be termed the classical analogy
literature. These principles are that: (i) high-level
analogies arise out of nascent, lower-level
analogies automatically recognized by memory
processes; (ii) analogy is memory-situated
inasmuch as it occurs in situ within the vast
interconnected tapestry of long-term semantic
memory, and may potentially draw upon any
knowledge fragment; and (iii) this memory-situatedness
frequently makes analogy necessarily
dependent on some form of attributive
grounding to secure its analogical interpretations.
In this paper we discuss various arguments,
pro and con, for the computational and
cognitive reality of these principles.

INTRODUCTION

Over the last few years, we have been examining
the computational capabilities of models
of analogy (see Veale & Keane, 1993, 1994,
1997; Veale et al., 1996). Some models of analogy,
like the original version of the Structure-Mapping
Engine (SME; Falkenhainer, Forbus
& Gentner, 1989), have been concerned with
producing optimal solutions to the computational
problems of structure mapping, although
more recently, many models have adopted a
more heuristic approach to improve performance
at the expense of optimality; models like
the Incremental Analogy Machine (IAM; Keane
& Brayshaw, 1988; Keane et al., 1994), the
Analogical Constraint Mapping Engine
(ACME; Holyoak & Thagard, 1989), Greedy-SME
(see Forbus & Oblinger, 1990) and Incremental-SME
(see Forbus, Ferguson & Gentner,
1994). These classical structure-mapping models
have also been predominantly concerned
with modelling the details of a corpus of
psychological studies on analogy.

In contrast, there is a different non-classical
tradition that has concentrated on capturing
key properties of analogising, with less
reference to the mainstream psychological
literature (e.g., the Copycat system of Hofstadter
et al., 1995; the TableTop system of
Hofstadter & French, 1995; and the AMBR
system of Kokinov, 1994). Recently, there
has been something of a confluence of these
two traditions as models have emerged that
exhibit many of the parallel processing properties
of the non-classical approach together with the
computational and empirical constraints of classical
models; models like Sapper (see Veale
& Keane, 1993, 1994, 1997; Veale et al.,
1996) and LISA (see Hummel and Holyoak,
1997). While these models are clearly different
to classical models, it is not immediately
obvious whether they are just algorithmic
variations on the same computational-level
theme, or whether they constitute a significant
departure regarding the principles of
analogy. In this paper, using Sapper as a focus,
we argue that there are at least three principles
on which Sapper differs from wholly
classical models. We also argue from a computational
perspective that Sapper offers several
performance efficiencies over optimal
and sub-optimal classical models.

Figure 1. The Triangulation Rule (i) and the Squaring Rule (ii) augment semantic memory with additional bridges
(denoted M), indicating potential future mappings. (Recoverable panel labels: (i) The Triangulation Rule; Orchestra; Army.)

PRINCIPAL  DIFFERENCES 

Sapper accepts most of the computational-level assertions made about
structure mapping, such as the importance of isomorphism, structural
consistency and systematicity (see Keane et al., 1994, for a
computational-level account). An ongoing discussion with several
researchers in the field has helped to define its differences in principle
from classical models (cf. Ferguson, Forbus & Gentner, 1997; Thagard,
1997). In summary, they are that:

•  Analogies are forever nascent in human memory: that human memory is
continually preparing for future analogies by establishing potential
mappings between domains of knowledge.

•  Mapping is memory-situated: that mapping occurs within a richly
elaborated tangle of conceptual knowledge in long-term memory.

•  Attributes are important to mapping: that attribute/category
information plays a crucial role in securing both the relevance and the
tractability of an analogical mapping.

At present, the psychological literature is silent on many of these
points. In this paper, we address these issues by outlining each of the
principles in more detail and evaluating the computational and
psychological evidence of relevance to them.

NASCENT ANALOGIES

The picture Sapper creates of the analogy process is quite different from
the goal-driven, just-in-time construction of analogies associated with the
classical models. In the classical tradition, all analogising occurs when
current processing demands it, a proposal that is most obvious in the
centrality given to pragmatic constraints (see Holyoak & Thagard, 1989;
Keane, 1985; Forbus & Oblinger, 1990). In these models, mappings are
constructed when the system goes into “analogy mode” and are not prepared
in advance of an analogy-making session. In contrast, Sapper models
analogy-making as a constant background activity where potential mappings
are continually and pro-actively prepared in memory, to be exploited when
particular processing goals demand them. Analogies are thus forever nascent
in Sapper’s long-term memory.

Sapper forms analogies using spreading activation within a semantic
network model of long-term memory, by exploiting conceptual bridges that
have been established between concepts in this network. These bridges
record potential mappings between concepts and are automatically added by
Sapper to its semantic network when the structural neighbourhoods of two
concepts share some local regularity of structure. Such bridges are highly
tentative when initially formed, and thus remain dormant inasmuch as they
are not used by “normal” spreading activation in the network. But dormant
bridges can be awakened, and subsequently used for spreading activation,
when some proposed analogical correspondence between the concepts is made
by the cognitive agent.

The regularities of structure which Sapper exploits to recognize new
bridge-sites in long-term memory are captured in two rules that are
graphically illustrated in Figure 1: the triangulation and squaring rules.
The triangulation rule asserts that:

If memory already contains two linkages L_ij and L_kj of semantic type L
forming two sides of a triangle between the concept nodes C_i, C_j and
C_k, then complete the triangle and augment memory with a new bridge
linkage B_ik.


For example, in Figure 1(i), when concepts BATON and SABRE have the shared
predicates LONG and HANDHELD, the triangulation rule will add a bridge
between them, which may subsequently be exploited by an analogy. In
predicate calculus notation, this could be interpreted as asserting that
when two concepts partake in two or more instances of predications which
are otherwise identical, they become candidates for an analogical mapping,
e.g., that long(BATON) & handheld(BATON) and long(SABRE) & handheld(SABRE)
suggest that BATON and SABRE are candidates for an entity mapping in a
later analogy. Memory is thus seen by Sapper as pro-actively exploiting
perceptual similarities to pave the way for future structural analogies
and metaphors; much like Hofstadter & French (1995), then, Sapper views
analogy and metaphor as outcrops of low-level perception.
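
The triangulation rule can be illustrated with a minimal Python sketch. This is not the authors' Prolog implementation: the triple encoding of links, the function name triangulate and the min_shared threshold are assumptions made here for illustration. With min_shared=1 the sketch follows the rule as stated (one shared associate of the same link type); the example call uses 2 to mirror the BATON/SABRE reading, in which two shared predications are required.

```python
from itertools import combinations


def triangulate(links, min_shared=1):
    """Propose bridges between concepts that share typed links.

    links is an iterable of (concept, link_type, associate) triples.
    Two concepts are bridged when at least min_shared of their
    (link_type, associate) pairs coincide.
    """
    neighbourhood = {}
    for concept, link_type, associate in links:
        neighbourhood.setdefault(concept, set()).add((link_type, associate))

    bridges = set()
    for first, second in combinations(sorted(neighbourhood), 2):
        if len(neighbourhood[first] & neighbourhood[second]) >= min_shared:
            bridges.add((first, second))
    return bridges


# Hypothetical fragment of semantic memory, after Figure 1(i).
links = [
    ("BATON", "attr", "LONG"),
    ("BATON", "attr", "HANDHELD"),
    ("SABRE", "attr", "LONG"),
    ("SABRE", "attr", "HANDHELD"),
    ("DRUM", "attr", "LOUD"),
]
print(triangulate(links, min_shared=2))  # {('BATON', 'SABRE')}
```

The bridges returned here are only candidates; in Sapper's terms they would stay dormant until some analogy awakens them.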

The structural integrity of these analogical outcrops is enforced by the
squaring rule, which works at a higher level over collections of bridges
between concepts:

If B_jk is a conceptual bridge, and if there already exist the linkages
L_ij and L_lk of the predicate type L, forming three sides of a square
between the concept nodes C_i, C_j, C_k and C_l, then complete the square
and augment long-term memory with a new bridge linkage B_il.

For example, in Figure 1(ii) the bridges established using triangulation
between PERCUSSION -> ARTILLERY and DRUM -> CANNON support the formation
of an additional bridge between ORCHESTRA and ARMY using the squaring
rule. The intuition here is that correspondences based on low-level
semantic features can support yet higher-level correspondences (see
Hofstadter et al., 1995; Hummel & Holyoak, 1997).
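
As a rough illustration of the squaring rule, the following Python sketch repeatedly completes squares until no new bridge can be added. The representation (links as head/type/tail triples, bridges as unordered pairs), the function name squaring_closure and the "part" link type are assumptions made here, not details taken from the Sapper implementation.

```python
def squaring_closure(links, bridges):
    """Extend a set of bridges with the squaring rule until nothing changes.

    links is a list of (head, link_type, tail) triples. bridges is a set of
    frozenset pairs. If two heads reach already-bridged tails via links of
    the same type, the heads themselves become bridged.
    """
    bridges = set(bridges)
    changed = True
    while changed:
        changed = False
        for head_a, type_a, tail_a in links:
            for head_b, type_b, tail_b in links:
                candidate = frozenset((head_a, head_b))
                if (type_a == type_b
                        and head_a != head_b
                        and frozenset((tail_a, tail_b)) in bridges
                        and candidate not in bridges):
                    bridges.add(candidate)
                    changed = True
    return bridges


# Hypothetical links loosely following Figure 1(ii).
links = [
    ("ORCHESTRA", "part", "PERCUSSION"),
    ("ARMY", "part", "ARTILLERY"),
    ("PERCUSSION", "part", "DRUM"),
    ("ARTILLERY", "part", "CANNON"),
]
seeds = {
    frozenset(("PERCUSSION", "ARTILLERY")),
    frozenset(("DRUM", "CANNON")),
}
result = squaring_closure(links, seeds)
print(frozenset(("ORCHESTRA", "ARMY")) in result)  # True
```

The two seed bridges stand in for bridges that triangulation would have laid down earlier; squaring then lifts them to the ORCHESTRA/ARMY level.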

The proposal that analogies are forever nascent in human memory may seem
computationally implausible because it suggests a proliferation of
conceptual bridges that would quickly overwhelm our memories with
irrelevant conceptual structure. In practice, this does not seem to be the
case. In performance experiments, we have shown that as a knowledge-base
grows so too does the number of bridges, but in a polynomially modest
fashion (see Veale et al., 1996). Indeed, the notion of a conceptual bridge
is a compelling one that seems to have emerged independently from multiple
researchers in the field (e.g., Veale & Keane, 1993; Eskridge, 1994;
Hofstadter et al., 1995). From a psychological perspective, some have
argued that forming potential mappings in advance of an analogy is
implausible (e.g., see Ferguson et al., 1997). While we know of no evidence
that directly supports or denies the bridging stance, it does gel with
certain broad phenomena. The inherent flexibility and speed of people’s
analogical mapping, even within relatively large domains, suggests that
some pre-compiled correspondences are used, otherwise the mapping problem
approaches intractability; this is especially so when slippage and
re-representation in these domains are also implicated. Similarly,
Hofstadter and his team’s characterisation of people’s alacrity in
performing conceptual slippage between different entities is more
consistent with this account than with classical models.

MAPPING  IS  MEMORY-SITUATED 

Sapper sees the mapping process as being essentially memory-situated, that
is, that the generation of mapping-rich interpretations can only be carried
out within a long-term memory of richly interconnected concepts. In
character, this is quite different to classical models, which see analogues
as delineated bundles of knowledge, segregated parcels of predications that
are retrieved from memory and mapped in “another place” (usually a
temporary working memory). In some cases, this knowledge-bundling seems
more plausible than in others. For instance, it makes some sense in the
encoding of episodic event sequences (typically used in bench-marking
analogy models), although even in these cases many of the properties of
object-centred concepts (i.e., those typically expressed at a linguistic
level via nouns rather than verbs) seem to be unnaturally suppressed. This
bundling makes less sense in other cases, as in the profession domains used
in Sapper where objects (such as General, Surgeon, Scalpel, Army, etc.) are
the focal points of the representation, and relations are hung between
them. In turn, this has led to the objection that Sapper’s test domains
inappropriately include “the whole of semantic memory” in the domain
representation (cf. Thagard, 1997). We would argue that this is entirely
the point; natural analogy is performed within large, elaborated domains
involving many predicates with few clear boundaries on relevance. Since
clever analogies and metaphors surprise and delight us by the unexpected
ways in which they relate the dissimilar, the mapping device is frequently
itself the relevance mechanism. Let’s consider then how Sapper forms
analogies in a memory-situated fashion.

Sapper performs analogical mapping by spreading activation through its
semantic memory, pin-pointing cross-domain bridges that might potentially
contribute to a final interpretation (see Appendix A for the algorithm).
The algorithm first performs a bi-directional breadth-first search from the
root nodes of the source (S) and target (T) domains in memory, to seek out
all relevant bridges that might potentially connect both domains, and thus
finds an intermediate set of candidate matches (or pmaps, in SME parlance).
To avoid a combinatorial explosion, this search is limited to a fixed
horizon H of relational links (usually H = 6) while employing the same
predicate identicality constraint as SME for determining structural
isomorphism. Then, the richest pmap (i.e., the pmap containing the largest
number of cross-domain mappings) is chosen as a seed to anchor the overall
interpretation, while other pmaps are folded into this seed if they are
consistent with the evolving interpretation, in descending order of the
richness of those pmaps (in a manner that corresponds closely to
Greedy-SME)¹. The use of memory situatedness in combination with the other
features of Sapper delivers effective performance on mapping these
analogies.
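
The seed-and-fold step described above can be sketched in Python as follows. This is a simplified illustration rather than the Sapper code itself: pmaps are assumed to be sets of (target, source) pairs, richness is taken to be the number of pairs, and the concept names other than SURGEON, BUTCHER and SCALPEL are invented for the example.

```python
def is_consistent(interpretation, pmap):
    """Check that adding pmap keeps the mapping one-to-one."""
    forward = dict(interpretation)
    backward = {source: target for target, source in interpretation}
    for target, source in pmap:
        # A concept may not be mapped in two different ways.
        if forward.get(target, source) != source:
            return False
        if backward.get(source, target) != target:
            return False
        forward[target] = source
        backward[source] = target
    return True


def greedy_merge(pmaps):
    """Seed with the richest pmap, then fold in consistent pmaps by richness."""
    ranked = sorted(pmaps, key=len, reverse=True)
    if not ranked:
        return set()
    interpretation = set(ranked[0])
    for pmap in ranked[1:]:
        if is_consistent(interpretation, pmap):
            interpretation |= set(pmap)
    return interpretation


# Hypothetical pmaps for "A Surgeon is a Butcher".
pmaps = [
    {("SURGEON", "BUTCHER"), ("SCALPEL", "CLEAVER"), ("PATIENT", "CARCASS")},
    {("SCALPEL", "CLEAVER"), ("THEATRE", "ABATTOIR")},
    {("SCALPEL", "KNIFE")},
]
interpretation = greedy_merge(pmaps)
print(len(interpretation))  # 4
```

The third pmap is discarded because SCALPEL is already mapped to CLEAVER, which is the sense in which the merge is greedy and therefore sub-optimal: an early choice can block a later, possibly better, one.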

Tests of Sapper relative to other models have been performed on a corpus
of 105 metaphors between profession domains (e.g., “A Surgeon is a
Butcher”), where these domains contain an average of 120 predications each
(on average, 70 of these are attributional, coding taxonomic position and
descriptive properties).

Aspect                    Optimal-SME      Greedy-SME   ACME         Sapper (Vanilla)  Sapper (exhaustive)
Avg. number of mid-level  269              269          12,657       18                [illegible]
pmaps per metaphor
Average run-time          N/A (worst case  17* seconds  N/A in       12.5* seconds     720* seconds
per metaphor              O(2^269) sec.)                time-frame

*  Running on a 166 MHz Pentium  •  Running on a SPARC 2

Table I. Comparative Run-Time Evaluation of SME and ACME and Sapper.


Sapper’s long-term memory for these profession domains is coded via a
semantic network of 300+ nodes with just over 1,600 inter-concept
relations. Table I shows that Sapper performs better than the classical
models in these domains (SME and ACME return no results for many examples
in an extended time-frame, though Greedy-SME fares much better). Three
caveats should be stated to qualify these results. First, although the
average pmap measurement for Optimal-SME is clearly quite poor (inasmuch as
it over-complicates the interpretation process immensely), this average
underestimates its adequacy on some individual metaphors; as Ferguson
(1997) has noted, Optimal-SME can map some metaphors with smaller pmap
sets, e.g., Hacker as Sculptor from 49 pmaps in 1,077 seconds, Accountant
as Sculptor from 43 pmaps in 251 seconds, and Butcher as Sculptor from 47
pmaps in 443 seconds. Second, other models can do better if they use
tailored re-representations of Sapper’s domains (in which, for example,
attributions are ignored), but this raises problems as to the theoretical
import of such re-representations. Third, these results establish whether
the tested models can find some interpretation for a given metaphor, but
they say nothing about the quality of the analogy returned.


¹ Ferguson et al. (1997) have argued that Sapper cannot exploit matches
based on extended chains of relational links. While many of the chains are
quite short in the professions domains, recent tests have shown that Sapper
has no difficulties with longer chains.


For each test metaphor, there is an optimal set of cross-domain matches,
so to assess the quality of a given interpretation, one needs to note how
many of the produced matches actually intersect with this optimal set (as
generated by the exhaustive variant of Sapper profiled in Table I), taking
into account the number of “ghost mappings” (i.e., matches included in the
interpretation that should not have been generated).

Table II shows some quality results for the more efficient structure
mappers, Vanilla Sapper and Greedy-SME (Greedy-SIM is our simulation of
Greedy-SME earlier reported in Veale & Keane, 1997, and Greedy-SME is based
on an analysis of the outputs provided to us by the SME Group). Three
measures of quality are used (borrowing some terms from the field of
information retrieval, e.g., Van Rijsbergen, 1979). Recall is the number of
optimal mappings generated, measured as a percentage of the total number of
optimal mappings available. Precision is the number of optimal mappings
generated, measured as a percentage of the total number of mappings
generated by the model. Recall indicates the productivity (or
under-productivity) of a model, while precision indicates over-productivity
(or the propensity to generate “ghost mappings”). Finally, we measured the
percentage of times a perfect, optimal interpretation was produced by the
model.
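
These quality measures are simple set computations. The Python sketch below assumes each interpretation is a set of (target, source) pairs; the function name interpretation_quality and the single-letter concept names are hypothetical and serve only to show the arithmetic.

```python
def interpretation_quality(generated, optimal):
    """Compare a generated interpretation against the optimal one.

    Recall: share of optimal mappings that were generated.
    Precision: share of generated mappings that are optimal.
    Ghosts: generated mappings that are not in the optimal set.
    """
    generated, optimal = set(generated), set(optimal)
    hits = generated & optimal
    recall = 100.0 * len(hits) / len(optimal) if optimal else 0.0
    precision = 100.0 * len(hits) / len(generated) if generated else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "ghosts": generated - optimal,
        "perfect": generated == optimal,
    }


optimal = {("a", "w"), ("b", "x"), ("c", "y"), ("d", "z")}
generated = {("a", "w"), ("b", "x"), ("c", "y"), ("e", "u"), ("f", "v")}
scores = interpretation_quality(generated, optimal)
print(scores["recall"], scores["precision"])  # 75.0 60.0
```

Here three of the four optimal mappings are recovered (recall 75%), but two of the five generated mappings are ghosts (precision 60%), so the interpretation does not count as perfect.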

The results shown in Tables I and II lead one to conclude that while Sapper
and Greedy-SME take roughly the same time to process metaphors, the quality
of the latter lags behind the former. Our analyses suggest that the
specific features underlying the proposed principles contribute to Sapper’s
better performance, namely: its pre-preparation of potential mappings in
memory, the use of a richly elaborated semantic memory and its exploitation
of low-level similarity (the final issue to which we now turn).

Aspect                    Vanilla Sapper   Greedy-SIM    Greedy-SME
Merge Complexity          [illegible]      [illegible]   [illegible]
Precision                 95%              56%           60%
Recall                    95%              72%           72%
Perfect interpretations   77%              0%            0%

Table II. Quality of Interpretation Generated by Sapper, Greedy-SIM and
Greedy-SME.

ATTRIBUTES  ARE  IMPORTANT 

The third main difference in principle that emerges from Sapper is its
emphasis on attribute knowledge (also a cornerstone of the FARG models of
Hofstadter et al., 1995). For Sapper, attribute knowledge is always
necessary to ground the mapping process, whereas in classical models it
tends to be merely sufficient.

A central tenet of structure mapping theory (see Gentner, 1983) is that
analogy rests on relational rather than attribute mappings, although the
sometimes misleading influence of attribute mappings has been
well recognised (Gentner, Rattermann & Forbus, 1993; Gentner & Toupin,
1986; Keane, 1985; Markman & Gentner, 1993). Originally, in Optimal-SME,
analogies were found using analogy match-rules which explicitly ignored
attribute correspondences (unless they were the arguments to relational
matches; see Falkenhainer et al., 1989) and literally-similar comparisons
were handled by literal-similarity rules that matched both relations and
attributes. More recently, SME uses literal-similarity rules for both
analogies and literally-similar comparisons (see e.g., Markman & Gentner,
1993; Forbus, Gentner & Law, 1995). So, if a comparison yields mainly
systematic relational matches then it is an analogy, whereas if it yields
more attribute than relational matches then it is literally similar.
However, even though literal-similarity rules are used, attribute
information is typically only sufficient in the formation of analogies,
rather than necessary. If attribute matches are absent then SME will find a
systematic relational interpretation for the two domains, and if they are
present then it will find the same systematic relational interpretation
along with any consistent attribute matches².

In contrast, Sapper proposes a strong causal role for the grounding of
high-level correspondences in initial attribute correspondences. This model
will simply not find any matches unless they are, in some way, grounded in
attribute knowledge. The triangulation rule establishes a candidate set of
mappings using category information that anchors the later construction of
the analogy, so that correspondences established by the squaring rule are
built on the bridges found by the triangulation rule. Thus, Sapper assumes
that categories exist to enable people to infer shared causal properties
among objects.

There are several psychological and computational observations that support
this emphasis on the importance of attributes. First, as we already know,
human memory has a tendency to retrieve analogues that have attribute
overlap (see e.g., Keane, 1987; Gentner, Rattermann & Forbus, 1993; Holyoak
& Koh, 1987), which must mean that many everyday analogies rely heavily on
attribute overlaps (unlike the attribute-lite analogies used in most
illustrations of analogy, like the atom/solar system and
heat-flow/water-flow examples).

² This is not to deny that attribute matches can be necessary to finding
an analogy. For example, if there are two competing relational
interpretations with equal systematicity, then attribute matches could tip
the balance in favour of one. Similarly, in cross-mappings, attribute
matches can misdirect the comparison process (cf. Markman & Gentner, 1993).
However, our intuition is that these situations are the exception rather
than the rule.

Second, category information constrains the computational exercise of
finding a structure mapping. When reasoning about two analogical
situations, people will intuitively seek to map elements within categories;
for instance, when mapping Irangate to Watergate, presidents will map to
presidents, patsies to patsies, reporters to reporters, and so on. With
these initial, tentative mappings in mind, the structure-mapping exercise
that follows may be greatly curtailed in its combinatorial scope (for
supporting psychological evidence see Goldstone & Medin, 1994; Ratcliff &
McKoon, 1989).

Third, the triangulation of attributive information allows Sapper to model
an important aspect of metaphor interpretation that has largely been
ignored in most classical structure-mapping models, namely domain
incongruence (Ortony, 1979; Tourangeau & Sternberg, 1981). The same
attribute can possess different meanings in different domains and this
plurality of meaning serves to ground a metaphor between these domains. For
instance, when one claims that a “tie is too loud”, the attribute LOUD is
being used in an acoustic and a visual sense; a GARISH tie is one whose
colours invoke a visual counterpart of the physical unease associated with
loud, clamorous noises. But for LOUD to be seen as a metaphor for GARISH
such attributes must possess an internal semantic structure to facilitate
the mapping between both. That is, attributes may possess attributes of
their own (e.g., both LOUD and GARISH may be associated with SENSORY,
INTENSE and UNCOMFORTABLE). The division between structure and attribution
is not as clean a break then as classical models predict; rather structure
blends into attribution and both should be handled homogeneously. This
homogeneity is perhaps one of the strongest features of non-classical
models.

This asserted centrality of attribute information in the mapping process
may seem to be contradicted by evidence of aptness ratings on analogy,
which show that apt analogies have few attribute overlaps (see Gentner &
Clement, 1988; also soundness, see Gentner, Rattermann & Forbus, 1993)³.
However, there is a possibility that these ratings may just reflect a folk
theory of analogy. More plausibly, since we argue that the role of
attributes is to ground high-level structure in low-level perception, the
effect of this grounding may not be apparent to subjects, particularly when
this grounding occurs at a significant recursive remove (e.g., H = 5).
Ultimately then, these aptness ratings may tell us nothing about what
actually facilitates the process of structural mapping.

³ As a contrasting view, note that Tourangeau & Sternberg (1981) argue that
aptness is based on attribution and domain incongruence.

CONCLUSIONS 

In this paper, we have tried to show that a very different computational
treatment of structure mapping in a localist semantic memory diverges from
so-called classical models of analogy in three important respects. Models
like Sapper promote the idea that memory is continuously laying the
groundwork for analogy formation, that analogical mapping should be
memory-situated, and that attribute correspondences play a key role in the
mapping process. Computationally, it is clear that at least one
instantiation of these ideas does a very good job at dealing with the
computational intractability of structure mapping, albeit in a sub-optimal
fashion. Our experiments, both on our own profession domain metaphors (in
which Sapper out-performs other models) and the benchmark analogies of
other models (such as Karla as Zerdia and Socrates as Midwife, where Sapper
does at least as well as SME and ACME), suggest that of all the attempts at
sub-optimal mapping it seems to offer the best all-round performance.
Psychologically, much needs to be established to determine whether these
ideas are indeed correct. This clearly presents an interesting and fruitful
direction for future research.

To conclude, should readers wish to examine the experimental data used in
this research, it can be obtained (in Sapper, SME and ACME formats) from
the first author’s web-site: http://www.compapp.dcu.ie/~tonyv/metaphor.html
A Prolog implementation of the Sapper model is also available from this
location.

REFERENCES 

T. C. Eskridge. (1994). A hybrid model of continuous analogical reasoning.
In J. A. Barnden (ed.), Advances in Connectionist and Neural Computation
Theory. Norwood, NJ: Ablex.

B. Falkenhainer, K. D. Forbus & D. Gentner. (1989). Structure-Mapping
Engine: Algorithm and examples. Artificial Intelligence, 41, 1-63.

R. Ferguson. (1997). Personal Communication.

R. Ferguson, K. D. Forbus & D. Gentner. (1997). On the proper treatment of
noun-noun metaphor: A critique of the Sapper model. Proceedings of the
Nineteenth Annual Meeting of the Cognitive Science Society. NJ: Erlbaum.

K. D. Forbus, R. Ferguson & D. Gentner. (1994). Incremental
Structure-Mapping. Proceedings of the Sixteenth Annual Meeting of the
Cognitive Science Society. NJ: Erlbaum.

K. D. Forbus, D. Gentner & K. Law. (1995). MAC/FAC: A model of
similarity-based retrieval. Cognitive Science, 19, 141-205.

K. D. Forbus & D. Oblinger. (1990). Making SME pragmatic and greedy.
Proceedings of the Twelfth Annual Meeting of the Cognitive Science Society.
Hillsdale, NJ: LEA.

D. Gentner. (1983). Structure-Mapping: A theoretical framework for analogy.
Cognitive Science, 7, 155-170.

D. Gentner & C. Clement. (1988). Evidence for relational selectivity in the
interpretation of analogy and metaphor. The Psychology of Learning &
Motivation, 22. New York: Academic Press.

D. Gentner, M. J. Rattermann & K. Forbus. (1993). The roles of similarity
in transfer: Separating retrievability from inferential soundness.
Cognitive Psychology, 25.

D. Gentner & C. Toupin. (1986). Systematicity and surface similarity in the
development of analogy. Cognitive Science, 10, 277-300.

R. L. Goldstone & D. L. Medin. (1994). Time course of comparison. Journal
of Experimental Psychology: Learning, Memory & Cognition, 20, 29-50.

D. R. Hofstadter & the Fluid Analogy Research Group. (1995). Fluid Concepts
and Creative Analogies: Computer Models of the Fundamental Mechanisms of
Thought. Basic Books, NY.

D. R. Hofstadter & R. French. (1995). The TableTop system: Perception as
Low-Level Analogy. In Fluid Concepts and Creative Analogies: Computer
Models of the Fundamental Mechanisms of Thought (ed. D. Hofstadter),
chapter 9. Basic Books, NY.

K. J. Holyoak & K. Koh. (1987). Surface and structural similarity in
analogical transfer. Memory & Cognition, 15, 332-340.

K. J. Holyoak & P. Thagard. (1989). Analogical mapping by constraint
satisfaction. Cognitive Science, 13, 295-355.

J. E. Hummel & K. J. Holyoak. (1997). Distributed representations of
structure: A theory of analogical access and mapping. Psychological Review.

M. Keane. (1985). On drawing analogies when solving problems: A theory and
test of solution generation in an analogical problem solving task. British
Journal of Psychology, 76, 449-458.

M. T. Keane. (1987). On retrieving analogues when solving problems.
Quarterly Journal of Experimental Psychology, 39A, 29-41.

M. T. Keane & M. Brayshaw. (1988). The Incremental Analogical Machine: A
computational model of analogy. In D. Sleeman (Ed.), European Working
Session on Learning. Pitman, 1988.



Principle  Differences  in  Structure-Mapping 


M. T. Keane, T. Ledgeway & S. Duff. (1994). Constraints on analogical
mapping: A comparison of three models. Cognitive Science, 18, 387-438.

B. N. Kokinov. (1994). A hybrid model of reasoning by analogy. In K. J.
Holyoak & J. A. Barnden (Eds.), Advances in Connectionist and Neural
Computation. Norwood, NJ: Ablex.

A. Markman & D. Gentner. (1993). Structural alignment during similarity
comparisons. Cognitive Psychology, 25, 431-467.

A. Ortony. (1979). The role of similarity in similes and metaphors. In A.
Ortony (Ed.), Metaphor and Thought. Cambridge, MA: Cambridge University
Press.

R. Ratcliff & G. McKoon. (1989). Similarity information versus relational
information. Cognitive Psychology, 21, 139-155.

R. Tourangeau & R. J. Sternberg. (1981). Aptness in metaphor. Cognitive
Psychology, 13, 27-55.

P. Thagard. (1997). Personal Communication.

C. J. van Rijsbergen. (1979). Information Retrieval. Butterworths.

T. Veale & M. T. Keane. (1993). A connectionist model of semantic memory
for metaphor interpretation. Workshop on Neural Architectures and
Distributed AI, 19-20, the Center for Neural Engineering, U.S.C.
California.

T. Veale & M. T. Keane. (1994). Belief modelling, intentionality and
perlocution in metaphor comprehension. Proceedings of the Sixteenth Annual
Meeting of the Cognitive Science Society. Hillsdale, NJ: Erlbaum.

T. Veale & M. T. Keane. (1997). The competence of sub-optimal structure
mapping on ‘hard’ analogies. IJCAI'97: The 15th International Joint
Conference on A.I. Morgan Kaufmann.

T. Veale, D. O’Donoghue & M. T. Keane. (1996). Computability as a limiting
cognitive constraint: Complexity concerns in metaphor comprehension.
Cognitive Linguistics: Cultural, Psychological and Typological Issues
(forthcoming).


Appendix  A:  Pseudocode  of  the  Sapper  Algorithm 
Function Sapper::Stage-I (T:S, H)

  Let P <- {}
  Spread Activation from roots T and S in long-term memory to a horizon H
  When a wave of activation from T meets a wave from S at a bridge T':S' linking a target
  domain concept T' to a source concept S' then:
    Determine a chain of relations R that links T' to T and S' to S
    If R is found, then the bridge T':S' is balanced relative to T:S, so do:
      Generate a partial interpretation p of the metaphor T:S as follows:
      For every tenor concept t between T' and T as linked by R do
        Align t with the equivalent concept s between S' and S
        Let p <- p ∪ {t:s}
      Let P <- P ∪ {p}
  Return P, a set of intermediate-level pmaps for the metaphor T:S

Function Sapper::Stage-II (T:S, P)

  Once all partial interpretations P = {p_i} have been gathered, do:
    Evaluate the quality (e.g., mapping richness) of each interpretation p_i.
    Sort all partial interpretations {p_i} in descending order of quality.
    Choose the first interpretation G as a seed for overall interpretation.
    Work through every other pmap p_i in descending order of quality:
      If it is coherent to merge p_i with G (i.e., respecting 1-to-1ness) then:
        Let G <- G ∪ p_i
      Otherwise discard p_i
  When {p_i} is exhausted, Return G, the Sapper interpretation of T:S
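
The greedy merge of Stage-II can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the pseudocode: a pmap is represented as a dictionary from target concepts to source concepts, quality is approximated by the number of aligned pairs, and the names `is_coherent` and `stage_two` are hypothetical.

```python
from typing import Dict, List

Pmap = Dict[str, str]  # assumed encoding: target concept -> source concept


def is_coherent(seed: Pmap, pmap: Pmap) -> bool:
    """Return True if merging pmap into seed keeps the mapping one-to-one."""
    claimed = {source: target for target, source in seed.items()}
    for target, source in pmap.items():
        if target in seed and seed[target] != source:
            return False  # target concept is already mapped to another source
        if source in claimed and claimed[source] != target:
            return False  # source concept is already claimed by another target
    return True


def stage_two(pmaps: List[Pmap]) -> Pmap:
    """Greedily merge partial interpretations, richest first."""
    if not pmaps:
        return {}
    # Assumption: mapping richness is simply the number of aligned pairs.
    ranked = sorted(pmaps, key=len, reverse=True)
    interpretation = dict(ranked[0])  # the seed G
    for pmap in ranked[1:]:
        if is_coherent(interpretation, pmap):
            interpretation.update(pmap)
        # otherwise the pmap is discarded
    return interpretation
```

With the invented pmaps `{"general": "surgeon", "army": "staff"}`, `{"soldier": "nurse"}` and `{"enemy": "surgeon"}`, the first is chosen as seed, the second is merged, and the third is discarded because `surgeon` is already aligned with `general`.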
1.  INTRODUCTION

A common reason for the use of analogy
in (computational) problem solving is the lack
of appropriate object-level knowledge, e.g.
rules, necessary to solve the problem from first
principles. Hence, the absence of sufficient
(object-level) domain knowledge is assumed in
most case-based reasoning (CBR) systems.
Even those CBR systems that combine rule-based
and case-based reasoning rely on a similar
assumption: if rules exist, then reason from
first principles, otherwise use case-based
reasoning [17,18]. As a result, the use of analogy
as a search control strategy, i.e., for
transferring control knowledge, is hardly an issue
in CBR research, except in case-based planning (CBP).

As far as we know, the situation is similar in
cognitive research on analogy. Why is this? One
reason might be that, more often than not, the
problems chosen for cognitive experiments have
single-step solutions rather than solutions with
many steps as in planning, and hence search
control does not matter much. For instance, the
much investigated standard problems "atom/solar
system", "water flow/heat flow", and Duncker’s
radiation problem do not require a
search-intensive multi-step solution process.

As opposed to solutions of these problems,
Newtonian physics problem solving [20] and
especially mathematical theorem proving need
a complex multi-step problem solving process,
where search control is a central issue. The same
is true for many computational planning problems.
The problems to be solved by CBP may
have complex and multi-step solutions, e.g., in
mathematical theorem proving [12]. Therefore,
CBP aims at reducing the search effort for
finding a solution [5, 22, 1, 11].

This  paper  is  centered  around  our  experi¬ 
ences  with  problem  solving  for  complex  solu¬ 
tions  that  have  multiple  steps,  where  decisions 
as  to  which  sequences  of  steps  to  explore  are 
crucial.  Here,  problem  solving  by  analogy  can 
have  the  following  purposes: 

Computational analogy tries to improve the
exploitation of limited resources, in particular
of the number of user interactions, run time,
and knowledge. Hence, the purpose of analogy
can be (cf. [10]) to save user interaction
(which is a replacement for control knowledge
in interactive systems) or to replace
search-intensive subroutines at low cost.

Similarly,  for  human  problem  solving  by 
analogy  Van  Lehn  and  Jones  [20]  suggest  that 
at  least  good  human  problem  solvers  use  ana¬ 
logical  problem  solving: 

•  when no general (object-level) knowledge
(physics principles such as the force law,
Newton’s law, and mathematical transformations)
for solving a current problem is
available, e.g., if a knowledge gap has to
be detected and filled. For instance, subjects
detected a force that was missing in a
diagram by checking a previous solution.
Detecting a gap means discovering that
some principle is missing for a problem to
be solved.

•  when specific information from an example
can be used in order to work more
efficiently, in other words to save
search, e.g., for the explication of physics
quantities.

Put differently, the computational experience
and the described cognitive results suggest
that the analogical transfer of multi-step
problem solving requires the transfer of
(1) object-level knowledge and/or (2) control
knowledge, that is, decisions on the choice of
steps, instantiations, etc. As for the second, the
decisions may well depend on the problem solving
context. Therefore, the transfer of control
knowledge requires:

•  to  check  whether  the  target  context  justi¬ 
fies  a  decision  as  the  source  context  did. 
This  check  has  to  be  performed  immedi¬ 
ately  before  each  step  transfer  because  each 
step  in  a  solution  process  builds  on  results 
of  earlier  steps  and  hence  a  whole  trans¬ 
ferred  solution  may  be  invalidated  by  the 
failure  of  an  intermediate  step  and  a  sim¬ 
ple  modification  of  this  failed  step  alone 
cannot  guarantee  to  yield  a  valid  solution; 

•  to  actually  replay  source  decisions  in  the 
target.  These  decisions  may  differ  consid¬ 
erably  from  the  actual  solution  steps,  e.g., 
the  decisions  may  concern  abstract  steps 
that  can  yield  different  results  when  exe¬ 
cuted  in  different  situations. 

1.1  Contribution of the Paper

As explained in §2, derivational analogy
is a computational answer to the described
need to transfer control knowledge in
analogical problem solving. In that section we
discuss our experiences with derivational analogy
in a transportation planning domain and in
mathematical proof planning. Furthermore,
section 2.2 explains the transfer of object-level
knowledge by reformulation, which can be
combined with derivational analogy. Section 2.2.3
discusses some advantages of derivational analogy
compared to the pure transformational
approach assumed in most cognitive models.

Then  we  address  the  question  whether  com¬ 
putational  derivational  analogy  can  model  hu¬ 
man  analogical  transfer  of  multi-step  solutions 
under  certain  conditions.  We  suggest  some 


questions  to  be  addressed  empirically,  e.g.,  ‘Do 
characteristics  from  computational  derivation¬ 
al  analogy  transfer  to  the  spontaneous  or  guid¬ 
ed  use  of  analogical  problem  solving?’  In  par¬ 
ticular,  we  suggest  questions  whose  empirical 
answer  can  contribute  to  a  well-founded  sup¬ 
port  of  analogical  problem  solving,  say  in  teach¬ 
ing  and  assistant  systems. 

Our  expertise  is  in  computational  analogy. 
Therefore  our  questions  and  suggestions  should 
be  considered  a  mere  proposal  for  further  cog¬ 
nitive  and  multidisciplinary  research. 

2.  DERIVATIONAL  ANALOGY 

Derivational  analogy  introduced  in  [6]  de¬ 
notes  a  process  that  draws  analogies  from  the 
experiences  of  the  past  reasoning  process.  The 
underlying  key  insight  is  that  useful  experience 
is  encoded  in  the  reasoning  process  used  to  de¬ 
rive  solutions  to  similar  problems,  rather  than 
just  in  the  final  solution.  Therefore,  derivational 
analogy  is  a  reconstructive  method  by  which 
lines  of  reasoning,  i.e.,  of  search  control,  are 
transferred  and  adapted  to  a  new  problem  as 
opposed  to  transformational  analogy  that  adapts 
the  final  solutions. 

The derivational analogy framework has
been instantiated by several computational
systems, including BOGART [14], REMAID [3],
PRIAR [7], APU [2], Prodigy/Analogy [23],
and ABALONE [13]. These systems apply to a
variety of multi-step problem solving activities,
including software reuse in a UNIX programming
domain [2], the design of human-computer
interfaces [3], and several planning
applications [7, 22, 13].

Case-based planning is built on top of
a generative planner that generates the source
plans, which consist of operators reducing a goal
to subgoals. Typically, this generative planning
involves a lot of search because several operators
are applicable to a goal. Case-based planning
replays the choice of operators rather than
searching for them.

If possible, the derivational analogy replays
the choice of operators in a source plan step by
step. If the justification of a particular choice
does not hold in the target, then it may be
possible to carry out some reaction. As the outline
in Table 1 shows, the implementations of
derivational case-based planning have three main
components: the retrieval, including the mapping
from source to target; the check of justifications
for a source decision in the target; and
the actual replay in case the justification holds
or can be established. This kind of analogy permits
a partial transfer of solutions when a total
transfer cannot be justified.

input:  source plan, source and target problem
output: (partial) target plan
Map source and target.
while source plan not exhausted do
    Get next operator M from source plan.
    Check M’s justifications.
    If justifications hold, then transfer M to target
        and advance source,
    else choose suitable action.
Base-level plan for remaining open goals.

Table 1. Top-level Algorithm of Derivational CBP.
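
The replay loop of Table 1 can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the implementation of any of the cited systems: justifications are modelled as plain facts that must hold in the target state, the mapping step is omitted, and the names `Step`, `replay`, and `react` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

State = Set[str]  # assumed encoding: facts that hold in the target context


@dataclass
class Step:
    operator: str
    # Facts that justified choosing this operator in the source.
    justifications: List[str] = field(default_factory=list)


def replay(source_plan: List[Step],
           target_state: State,
           react: Callable[[Step, State], bool]) -> Tuple[List[str], List[str]]:
    """Transfer source decisions whose justifications hold in the target.

    Returns the transferred operators and the operators left open for the
    base-level planner.
    """
    transferred: List[str] = []
    open_steps: List[str] = []
    for step in source_plan:
        # The check happens immediately before each step is transferred.
        holds = all(j in target_state for j in step.justifications)
        if not holds:
            # Reaction: try to (re-)establish the failed justification.
            holds = react(step, target_state)
        if holds:
            transferred.append(step.operator)
        else:
            open_steps.append(step.operator)
    return transferred, open_steps
```

For an invented source plan `load-truck` (justified by `truck-at-depot`), `fly` (justified by `airport-open`), and `unload` (no justification), replayed in a target state containing only `truck-at-depot` with a reaction that always fails, the sketch transfers `load-truck` and `unload` and leaves `fly` open.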

In order to check justifications during the
analogical replay, these justifications have to be
stored and indexed. Derivational planning episodes
are generated automatically by extending
the base-level generative planner with the
ability to examine its internal decision cycle and
to record the justifications, i.e., the reasons why
an operator was chosen, for each decision during
its search process. Veloso [22] discusses the
importance of choosing relevant justifications
and of providing a language for justifications:
the stored information should be directly available
during the generative planning, and only
relevant information should be stored.

2.1  Analogy  in  Complex  Planning  Domains 

Planning  systems  in  Artificial  Intelligence 
fall  into  two  general  categories: 

1.  Hierarchical  “top-down”  planners,  such  as
SIPE  [25],  which  can  solve  relatively  com¬ 
plex  problems  but  require  significant 
knowledge  engineering  of  each  new  do¬ 
main,  and  also  exhibit  somewhat  rigid  plan¬ 
ning  behavior. 


2.  Operator-based  “bottom-up”  planners,  such 
as  PRODIGY  [24],  which  often  require 
massive  search  to  solve  complex  problems, 
but  make  do  with  simpler  knowledge  engi¬ 
neering  and  exhibit  more  robust  behavior, 
including  the  production  of  different  con¬ 
tingency  plans. 

In order to combine the best features of both
paradigms, the Prodigy project has integrated
non-linear operator-based planning with multiple
types of learning, including control-rule
learning, representation-change learning,
abstraction-hierarchy learning, and derivational
analogy. Learning provides search guidance and makes
more complex problems tractable, while retaining
the underlying flexibility of the operator-based
planner if necessary, i.e. when previously
acquired knowledge proves insufficient for solving
a novel problem. Derivational analogy has
proven particularly useful in this regard [23].

Among several application domains, Prodigy
was used to produce plans that solve
transportation/logistics problems whose solution
may require several hundred individual steps.
The transportation domain involves moving
multiple sets of objects through an inter-city
transportation network relying on different
vehicles (trucks, airplanes), with preference for
lower-cost solutions [26]. Prior to attempting
complex problems, Prodigy was trained with
simple problems, then increasingly more complex
ones, which led to the creation of a
1000-case library [22]. Rather than delving into the
details previously reported in the literature, let
us focus on the lessons learned:

•  Control Knowledge is Crucial - In theory,
all well-defined transportation problems
can be solved or proven unsolvable
by the first-principles planner. But, in
practice, base-line Prodigy would require
search spaces several orders of magnitude
larger than its maximum capabilities to
solve 200-step non-linearly-decomposable
transportation problems. Hence,
analogy expands the solvability horizon
of a planner just by supplying much-needed
control knowledge.
•  Reasoning with Justifications is Crucial
-  Pure transformational CBR does not
check justifications. These are crucial,
however, to guarantee the soundness of
the retrieved analog plans for the current
problem, or to repair the plan if the
justifications fail. Derivational analogy
works because all plans are equally
reliable - there is no tradeoff between careful
reasoning and risky memory lookup,
as justification checking eliminates the
risk of inapplicability.

•  Interleaving  Analogical  Rederivation  with 
First-Principles  Planning  is  Crucial  - 
Many  complex  problems  can  be  partially 
but  not  fully  solved  by  rederivation  of  past 
cases.  For  instance,  a  particular  road  used 
before  may  be  closed,  or  all  airplanes  in  a 
particular  city  may  be  grounded  by  fog. 
Or,  simply,  the  problem  places  some  new 
demands  not  previously  encountered.  Jus¬ 
tification  failure  is  an  invitation  to  reason 
from  first  principles  either  to  re-establish 
the  failed  justification  (e.g.  wait  for  the 
fog  to  clear),  or  to  keep  the  bulk  of  the 
plan  and  modify  the  failed  part  (e.g.  keep 
the  same  route,  but  detour  around  the 
closed  segment). 

•  Interleaving  Multiple  Cases  is  Very  Useful 

-  Most  often,  past  cases  solve  parts  of  the 
new  problem,  and  several  must  be  com¬ 
posed,  with  occasional  gaps  filled  in  by 
first-principles  planning,  in  order  to  solve 
increasingly  complex  problems. 

•  Derivational  Analogy  Does  Not  Sacrifice 
Plan  Efficiency  -  Derivational  analogy  plans 
efficiently,  but  does  it  produce  efficient 
plans?  This  is  a  legitimate  question  best  an¬ 
swered  empirically,  since  neither  first-prin¬ 
ciples  nor  analogical  planning  guarantees 
optimality. Tests showed equivalent plan
execution cost on average for the transportation
domain. Explicit learning of plan-efficiency
control rules, however, can help both
base-level and derivational analogy planning
produce plans that minimize execution cost [26].


•  Knowledge  Revision  is  an  Unresolved  Issue 
-  An  unresolved  issue  is  how  to  modify  a 
large  analogical  case  library  if  the  domain 
knowledge  changes  significantly.  For  in¬ 
stance,  if  a  new  mode  of  transportation  is 
invented  replacing  trucks  (as  the  latter  re¬ 
placed  horse-drawn  carts),  past  plans  be¬ 
come  obsolete.  However,  if  smaller,  more 
subtle  changes  occur  (e.g.  a  new  speed  limit 
is  enacted),  it  should  prove  feasible  to  sal¬ 
vage  the  plan  library.  Whereas  this  issue  re¬ 
mains  unresolved,  some  domains  such  as 
theorem  proving  (discussed  below)  need  not 
worry  about  the  underlying  mathematical 
knowledge  ever  reaching  obsolescence. 

2.2  Analogy  in  Planning  Proofs  of  Mathe¬ 
matical  Theorems 

Proof planning is a methodology for
automated theorem proving that constructs a
proof by search at the abstract level of proof
plans [4]. On top of a proof planner,
analogy-driven proof plan construction [9] yields
a (partial) proof plan that may be expanded
to a proof. Analogy-driven proof plan
construction is an extension of the general
derivational CBP for three reasons: it extends the
mapping to a second-order mapping; it uses new
kinds of justifications, described below, that
extend those in simpler planning domains; and it
includes reformulations of the source plan, as
shown in Table 2.

Sometimes a step-by-step replay will not
be enough. In this case, the source plan may
be reformulated before the replay. Reformulations
can insert, change, or delete source
operators. They map proof plans in a way
based on differences between the source and
target problems, i.e., they are triggered by
peculiarities in the second-order mapping. For
instance, the reformulation 1to2 is triggered
when there is a C-equation fi = fj (see below)
and the mapping me from source to target
violates the equation as follows:
me(fi)(me(fi)) = me(fj). 1to2 changes a
one-step induction in the source to a two-step
induction in the target. In addition, it doubles
certain operators in the target plan.
input:  source plan, source theorem and assumptions, target theorem and assumptions
output: (partial) target plan
Second-order map source and target; this triggers reformulations of the source plan.
source plan <- reformulated source plan
while source plan not exhausted do
    Get next operator M from source plan.
    Check M's justifications.
    If justifications hold, then transfer M to target
        and advance source,
    else choose suitable action.
Plan from first principles for remaining open goals.

Table 2. Outline of Analogy-Driven Proof Plan Construction.


Since an operator such as induction
computes its outputs, the actual subproof that is
represented by the operator may vary between
solutions. Hence operators are abstract entities in
the solution, and an analogical replay requires
actually applying a chosen operator in the
target in order to get the correct output.

Now  we  present  some  justifications  and 
explain  the  reaction  to  failed  justifications  with 
the  following  example  where  the  source  prob¬ 
lem  is  a  theorem  and  lemmata  about  lists. 

The source proof plan has operators
such as induction, elementary (for trivial
subproofs), and wave, which we won't explain here.
(Note, however, that operators such as induction
or elementary may produce different
subproofs in different situations.) The target
problem is one about natural numbers:
Figure 1. Proof Plan of the Theorem lenapp.


Source Theorem (lenapp):
    length(app(a,b)) = length(app(b,a))
Lemmata:
    app2:     app(cons(X,Y),Z) => cons(X, app(Y,Z))
    len2:     length(cons(X,Y)) => s(length(Y))
    lenapp2:  length(app(X, cons(Y,Z))) => s(length(app(X,Z)))


Target Theorem (halfplus):
    half(+(a,b)) = half(+(b,a))
Lemmata:
    plus2:  +(s(Y),Z) => s(+(Y,Z))
    half3:  half(s(s(Y))) => s(half(Y))


Justifications  Stored  in  the  Source  Plan 

The analogy system ABALONE is implemented
on top of the proof planner CLAM and
stores two new kinds of justifications:

•  Legal  conditions  on  the  context  for  the  ap¬ 
plication  of  operators,  such  as  the  existence 
of  a  lemma  that  is  necessary  to  apply  the 
wave  operator. 

•  Constraints  on  the  objects  (e.g.,  the  function 
symbols)  that  are  required  for  the  source 
solution,  in  particular  the  identity  of  differ¬ 
ent  occurrences  of  a  function  symbol. 
Since  ABALONE  is  able  to  send  a  func¬ 
tion  symbol  at  different  positions  in  the  source 
to  different  target  images,  source  function  sym¬ 
bols  at  different  positions  are  differentiated  by 
indices.  Then  the  source  problem  becomes 
(only  some  indices  are  shown  for  simplicity). 
Source Theorem (lenapp):
    length(app(a,b)) = length(app(b,a))
Lemmata:
    app2:     app(cons1(X,Y),Z) => cons2(X, app(Y,Z))
    len2:     length(cons3(X,Y)) => s1(length(Y))
    lenapp2:  length(app(X, cons4(Y,Z))) => s2(length(app(X,Z)))

During the source planning, constraints
may be placed on these indices, yielding
C(onstraint)-equations of the form fi = fj in
the source plan. These C-equations form an
additional justification that must be satisfied
in the target for a successful replay. The
following C-equations emerge from the source
planning process for the source problem lenapp:
cons5=cons1, cons2=cons3, cons5=cons4,
s1=s2, where cons5 is introduced by the
induction operator.

We have to consider the mappings found
for the example in order to understand how they
violate justifications. The second-order basic
mapping mb for the theorems is: length_i -> half
and app_i -> + (for all i). mb is extended to a
mapping me that maps the source and target
lemmata. For instance, lemma app2 is mapped in
the following way:

    app(cons1(X,Y),Z) => cons2(X, app(Y,Z))    (app2)
    mb:  +(cons1(X,Y),Z) => cons2(X, +(Y,Z))
         +(s(Y),Z) => s(+(Y,Z))                (plus2)
    me(cons1) = me(cons2) = λw1 λw2. s(w2)

Since mb(app)=+, lemma app2 can be partially
mapped to +(cons1(X,Y),Z) => cons2(X, +(Y,Z)).
The mapping is completed by mapping
the source lemma to an available target lemma.
In this way, app2 maps to plus2 with
cons1 -> λw1 λw2. s(w2) and cons2 -> λw1 λw2. s(w2).
Similarly, len2 maps to half3 because of
mb(length) = half, giving me(s1) = λw1. s(w1) and
me(cons3) = λw1 λw2. s(s(w2)). Note that the latter
violates the C-equation cons3=cons2, because
cons2 and cons3 have different target images.
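
Checking C-equations against a mapping can be sketched in a few lines of Python. The sketch makes two assumptions that are not taken from ABALONE: target images are compared as plain strings (syntactic equality), and symbols without a known image are treated as unconstrained. The function name `violated_c_equations` is hypothetical.

```python
from typing import Dict, List, Tuple

CEquation = Tuple[str, str]  # (fi, fj) stands for the constraint fi = fj


def violated_c_equations(c_equations: List[CEquation],
                         images: Dict[str, str]) -> List[CEquation]:
    """Return the C-equations whose two sides have different target images."""
    violated = []
    for left, right in c_equations:
        # Symbols that are not mapped yet cannot violate a constraint.
        if left in images and right in images and images[left] != images[right]:
            violated.append((left, right))
    return violated


# C-equations of the source plan, as listed in the text.
c_equations = [("cons5", "cons1"), ("cons2", "cons3"),
               ("cons5", "cons4"), ("s1", "s2")]

# Target images obtained from mapping app2 to plus2 and len2 to half3.
images = {
    "cons1": "λw1 λw2. s(w2)",
    "cons2": "λw1 λw2. s(w2)",
    "cons3": "λw1 λw2. s(s(w2))",
    "s1": "λw1. s(w1)",
}

print(violated_c_equations(c_equations, images))
```

Run on the images given above, only the pair (cons2, cons3) is reported, which is the violation discussed in the text.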

2.2.1  Reaction  to  Failed  Justifications 

If the check of a justification fails during
ABALONE’s analogical replay, certain reactions
to failed justifications try to make an
operator applicable anyway, for instance:

1.  If a justification that requires the existence
of a certain lemma does not hold in the
target, i.e., if a target lemma corresponding to
a certain source lemma cannot be found, then
ABALONE speculates a target lemma.

2.  If a C-equation is violated, then a
reformulation is applied under certain conditions.

In the example, a violation of the C-equation
cons3=cons2 occurs because me(cons3) ≠
me(cons2), and this triggers a 1to2 reformulation,
which duplicates the operator wave(app2)
such that the resulting target plan contains two
operators wave(plus2).
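
The operator-doubling part of the 1to2 reformulation can be illustrated by the following Python sketch. It is a simplification under stated assumptions: the source plan shown is invented (the actual plan of Figure 1 is not reproduced here), operators are modelled as (name, lemma) pairs, the change from one-step to two-step induction is not modelled, and the functions `one_to_two` and `rename_lemmas` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Operator = Tuple[str, Optional[str]]  # (operator name, lemma used or None)


def one_to_two(plan: List[Operator], doubled: Operator) -> List[Operator]:
    """Duplicate every occurrence of the given operator in the plan."""
    reformulated: List[Operator] = []
    for op in plan:
        reformulated.append(op)
        if op == doubled:
            reformulated.append(op)  # second copy of the doubled operator
    return reformulated


def rename_lemmas(plan: List[Operator],
                  lemma_map: Dict[str, str]) -> List[Operator]:
    """Replace source lemma names by their target images."""
    return [(name, lemma_map.get(lemma, lemma) if lemma else None)
            for name, lemma in plan]


# Invented source plan fragment, for illustration only.
source_plan = [("induction", None), ("wave", "app2"),
               ("wave", "len2"), ("elementary", None)]

reformulated = one_to_two(source_plan, ("wave", "app2"))
target_plan = rename_lemmas(reformulated, {"app2": "plus2", "len2": "half3"})
print(target_plan)
```

The resulting plan contains two consecutive wave steps using plus2, matching the description above.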

The first kind of failure occurs in the
example since the source lemma lenapp2 does not
have an image in the target: it cannot
be mapped to plus2 or half3 by extending mb.
The appropriate reaction is to speculate a
target lemma. ABALONE uses the mappings and
the C-equations s2=s1, with the mapping
me(s1) = λw1. s(w1), and cons4=cons5,
with the mapping me(cons5) = λw1 λw2. s(s(w2)),
to come up with the target lemma

    half(+(X, s(s(Z)))) => s(half(+(X,Z)))

as an image of

    length(app(X, cons4(Y,Z))) => s2(length(app(X,Z))).
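
The speculation step amounts to applying the extended mapping to both sides of the source lemma. A minimal Python sketch, assuming terms are encoded as nested tuples and the images of function symbols as Python functions (the encoding and the names `translate` and `show` are illustrative, not part of ABALONE):

```python
from typing import Callable, Dict, Tuple, Union

Term = Union[str, Tuple]  # a variable name, or (function symbol, arg1, ...)


def translate(term: Term, mapping: Dict[str, Callable[..., Term]]) -> Term:
    """Apply a second-order mapping to a source term, innermost first."""
    if isinstance(term, str):
        return term  # variables are kept as they are
    head, *args = term
    translated_args = [translate(arg, mapping) for arg in args]
    return mapping[head](*translated_args)


def show(term: Term) -> str:
    """Render a term in prefix notation, e.g. half(+(X, Z))."""
    if isinstance(term, str):
        return term
    head, *args = term
    return f"{head}({', '.join(show(arg) for arg in args)})"


# Images of the source function symbols, as given in the text.
mapping = {
    "length": lambda a: ("half", a),
    "app": lambda a, b: ("+", a, b),
    "cons4": lambda w1, w2: ("s", ("s", w2)),  # λw1 λw2. s(s(w2))
    "s2": lambda w1: ("s", w1),                # λw1. s(w1)
}

# Source lemma lenapp2: length(app(X, cons4(Y,Z))) => s2(length(app(X,Z)))
lhs = ("length", ("app", "X", ("cons4", "Y", "Z")))
rhs = ("s2", ("length", ("app", "X", "Z")))

print(show(translate(lhs, mapping)), "=>", show(translate(rhs, mapping)))
# prints: half(+(X, s(s(Z)))) => s(half(+(X, Z)))
```

Note that the image of cons4 ignores its first argument, which is why the variable Y does not occur in the speculated target lemma.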

2.2.2  Summary 

Derivational analogy is needed because
replaying an (abstract) decision in a certain
situation may result in a concrete solution that
cannot be obtained by simply transferring steps (e.g.,
different logical proofs produced by running the
elementary operator in different situations).

Justifications are crucial since they
guarantee the soundness of steps chosen
by analogy for a target problem.

Reasoning about justifications is crucial
because it allows reactions to failing
justifications in the target to be derived, possibly
depending on the available resources.

Justifications  may  serve  as  explanations  in 
proofs  presented  to  a  user. 

2.2.3  Advantages  of  Derivational 
Analogy 

Carbonell [6] discusses an example that illustrates
an advantage of derivational analogy: Suppose
you have coded a quicksort routine in Pascal, and
then you are asked to recode the routine in LISP.
Although the problem-solving process may preserve
much of the inherent similarity, the resultant
solutions may be hardly similar. A line-by-line
program transfer is clearly not appropriate,
but a reuse of major structural and control
decisions required to construct the Pascal program is
possible. Therefore, the analogy must be guided
by a reconsideration of the key decisions in light
of the new situation. In particular, the derivation
of the LISP quicksort program starts from the same
specification, retaining the same divide-and-conquer
strategy, but it may diverge in the selection
of data structures (lists vs. arrays) or in the method
of choosing the comparison element. However,
future decisions that do not depend on earlier
divergent decisions can still be transferred to the
new domain rather than recomputed.

Similarly, in proof planning several operators
represent an abstraction of the actual
subproof they produce. For instance, an application
of the operator induction involves computing an
induction schema, the induction variables, and the
base case and step case subgoals of the proof.
Likewise, elementary can produce different
proofs when executed. Thus the replay of a proof
plan in different situations can result in
different proofs and different subgoals, although
abstractly the source proof equals the target proof.

From  the  above  examples  other  advantag¬ 
es  of  derivational  analogy  can  be  summarized. 

3.  INTERESTING  QUESTIONS 

The above description of analogical search
control suggests the question ‘How does all this
apply to human analogical problem solving?’,
which implies many more specific questions
for cognitive research:

1.  Can  justifications/derivational  information 
be  found  in  spontaneous  human  analogy? 

2.  Is storing derivational information
psychologically implausible because of the limited
working memory, as proposed in [16]?
Is it necessary, as suggested by Reimann,
to store as much as possible from a problem
solving episode?

3.  What are relevant justifications in human
problem solving? Are they domain-dependent?

4.  Does memorizing relevant justifications
depend on expertise and on the ability of
self-explanation?

5.  How  do  expert  self-explanation  and  extract¬ 
ing  justifications  from  a  problem  solving 
process  relate? 

6.  What  is  the  impact  of  carefully  chosen  der¬ 
ivational  information  on  analogical  trans¬ 
fer  and  adaptation  performance? 

7.  Can  adaptation  schemas  be  found  in  hu¬ 
man  analogical  problem  solving?  How  do 
they  compare  with  reformulations  triggered 
by  failed  justifications? 

8.  Can  context,  as  addressed  in  [8],  be  mod¬ 
elled  by  derivational  information? 

9.  How do explanations as addressed in [15]
compare with justifications?

10.  Which  experimental  techniques  can  (near¬ 
ly)  exclude  mental  reference  to  derivation¬ 
al  information  that  cannot  be  observed  as 
opposed  to  explicit  reference? 

11. For research that cares about supporting
analogical reasoning, for instance for tutor
systems, the following questions may be
particularly interesting.

12. What is the influence of externally provided
derivational information on performance
and correctness of human analogical problem
solving? Hence, which information
should be provided in teaching and tutor
systems to support analogical problem
solving?

13. Does derivational information support
people in noticing analogies?

14.  Does  derivational  information  create  self- 
explanations? 

15.  Does  derivational  information  support 
learning  from  analogies? 

3.1  Related Work

Van Lehn [19] suggests that a solver who
‘understands how an example’s result is derived can
adapt it more intelligently to the target problem.
Thus, one would expect the Good solvers to use
derivational analogy more frequently than
non-derivational analogy and Poor solvers should use
non-derivational analogy more than derivational
analogy.’ To check this prediction, Van Lehn
analyzed transfer events in Newtonian physics
learning to see if the student explained the example
before transferring it. Van Lehn concludes that
self-explaining the example during analogical
problem solving is not particularly common. We
think that the experiments could be varied,
however, by providing (written) explanations in the
source problem solving and by experimenting
with more difficult multi-step solutions where
derivational analogy might be necessary.

Van Lehn models some analogical search
control in Cascade [21]. It stores triples
consisting of the problem, the goal, and the rule to
achieve the goal. Whenever faced with a search
control decision, Cascade decides by analogy.
Thereby Cascade could learn rules that it could
not otherwise learn. Analogical search control
modeled the intuition that students learn more
than just physics rules from studying the
examples because they also learn “how to”
knowledge. In experiments, Cascade’s analogical search
control did not match well with the protocols. On
the contrary, a default ordering of rules plus a few
general search heuristics sufficiently explained
the subjects’ behavior. We think that (1) the
latter explanation should be checked with more
complicated solutions for which rating the
steps is far from sufficient, e.g., in proof
planning, and (2) instead of always deciding by
analogy, we would expect analogical search
control only in case the search space is large,
i.e., many alternative decisions are possible.

Reimann [16] argues that derivational
analogy is a normative model for high-quality
analogical problem solving. He considers it
psychologically implausible, though.

4.  CONCLUSIONS 

Based on our experience in computational
analogy, we have pointed to characteristics and
advantages of derivational analogy in problem
solving. We discussed case-based planning for
problems of the transportation domain and of
mathematical proof planning. As opposed to
transformational analogy, derivational analogy
provides analogical search control based on
justifications for decisions. The choice and design
of the justifications is of great importance
to computational analogy systems. Does this
hold for human solvers too?

The derived questions and suggestions propose
further cognitive and multidisciplinary research,
in particular for supporting analogical
reasoning on complex problems. Conversely,
cognitive empirical results are essential in order
to acquire and represent the right knowledge
in computational systems that are supposed
to model or to support human analogical
problem solving, e.g., in a proof planner.

ABSTRACT

Holographic  Reduced  Representations 
(HRRs) are a method for encoding nested relational
structures in fixed-width vector representations.
HRRs encode relational structures as
vector  representations  in  such  a  way  that  the 
superficial  similarity  of  the  vectors  reflects  both 
superficial  and  structural  similarity  of  the  rela¬ 
tional  structures.  HRRs  support  a  number  of 
operations  that  could  be  very  useful  in  models 
of  analogy  processing:  fast  estimation  of  su¬ 
perficial  and  structural  similarity  via  a  vector 
dot-product;  chunking  of  vector  representa¬ 
tions;  and  finding  corresponding  objects  in  two 
structures. 

1.  INTRODUCTION 

Vector representations are popular for memory
models for a variety of theoretical and practical
reasons. They are simple and support fast parallel
processing such as comparison via dot-products.
They are also neurologically plausible, in
that they can be stored and processed in networks
of simple neuron-like processing elements, such
as associative vector memories. However, their
use in models of analogy processing has been limited
by the widespread supposition that it is difficult
or impossible to encode compositional structure
in vector representations (Fodor and Pylyshyn, 1988;
Ratcliff and McKoon, 1989; Thagard,
Holyoak, Nelson and Gochfeld, 1990; Gentner and
Markman, 1993; Forbus, Gentner and Law, 1994;
Wharton et al., 1994).

This supposition is false. Structure can be
represented in vectors in a number of ways, e.g.,
Smolensky's (1990) tensor products, Pollack's
(1990) RAAMs, Kanerva's (1996) binary spatter-codes,
and Plate's (1995) HRRs. This paper
describes HRRs and makes a number of claims
for their usefulness in models of analogy retrieval
and processing:

•  HRRs  provide  an  adequate  vector-based 
representation  of  structure  (in  contrast  to 
feature-vector  approaches,  which  must  be 
complemented  with  a  conventional  sym¬ 
bolic  representation  for  structure). 

•  Estimates  of  similarity  that  reflect  both  su¬ 
perficial  and  structural  similarity  can  be 
computed  quickly  via  vector  dot-products. 
This  technique  shows  similar  abilities  and 
limitations  with  respect  to  detecting  simi¬ 
larities  as  are  observed  in  people's  ability 
to  retrieve  items  from  long  term  memory. 

•  Corresponding  objects  in  two  analogical 
structures  can  be  found  via  fast  but  approx¬ 
imate  vector-based  techniques. 

•  HRRs  provide  an  elegant  implementation 
of  chunking  and  “pointers”  for  complex, 
structured  items  stored  as  vectors  in  a  con¬ 
tent  addressable  memory. 

2.  ANALOGY  PROCESSING  IN 
PEOPLE 

Analog  retrieval  and  mapping  have  re¬ 
ceived  a  significant  amount  of  attention  in  the 
psychological  literature.  Much  attention  has 
been  devoted  to  teasing  apart  the  differing  ef¬ 
fects  of  superficial  and  structural  similarity  in 
retrieval and mapping.

For illustration, the following series of episodes
is used in this paper. Together the episodes
involve dogs (Fido, Spot and Rover), people
(Jane, John and Fred), a cat (Felix) and a
mouse (Mort). Members of one species are assumed
to be similar to each other but not to
members of other species. The “probe” episode,
to which the others are compared, is “Spot bit
Jane, causing Jane to flee from Spot”. There
are five other episodes, which have different
combinations of types of similarity to the probe
(all share predicates with the probe):

LS (Literal Similarity) “Fido bit John,
causing John to flee from Fido.” (Has both
structural and superficial similarity.)

SF (Surface features) “John fled from Fido,
causing Fido to bite John.” (Has superficial but
not structural similarity.)

CM (Cross-mapped analogy) “Fred bit
Rover, causing Rover to flee from Fred.” (Has
both structural and superficial similarity, but
types of corresponding objects are switched.)

AN (Analogy) “Mort bit Felix, causing
Felix to flee from Mort.” (Has structural but not
superficial similarity.)

FOR (First-order-relations only) “Mort fled
from Felix, causing Felix to bite Mort.” (Has
neither structural nor superficial similarity, other
than shared predicates.)

It is generally accepted that in adults, structural
similarity plays a large role in analogical
mapping and conscious similarity judgements.
The role of structural similarity in retrieval is
less clear: some researchers argue that structural
similarity usually has little effect on retrieval
(Gentner, Rattermann, and Forbus, 1993) while
others argue that under some circumstances,
structural similarity can influence retrieval
(Wharton et al., 1994). Others suggest that structural
similarity matters only when the entities
involved in the situations share superficial features
(Ross, 1989). Overall, the general pattern
for retrievability of items from long-term memory
seems to be LS > CM > SF > AN > FOR.

Existing computational models of human
performance on analog retrieval tasks, such as
ARCS (Thagard et al., 1990) and MAC/FAC
(Forbus, Gentner and Law, 1994), have explained
the human retrieval data by invoking
two processes. The first is a simple one based
on superficial similarities. This explains much
of the human performance, but cannot account
for effects of structural similarity. Thus, these
models require a second process that takes
structural similarity into account, which involves
additional complex computation. In this
paper I will argue that HRRs can provide a
single-stage model based on vector-matching that
explains the pattern of retrieval ability observed
in people.

3.  VECTOR  REPRESENTATIONS 
AND  OPERATIONS 

The two vector operations commonly used
with vector representations are superposition
(i.e., addition) and similarity (i.e., dot-product
or cosine). These two vector operations, and
other scalar-vector operations such as scaling
and normalization, are sufficient for interesting
and useful memory models. With the addition
of the circular convolution operation for
binding, one can encode associations in vector
patterns and thus encode structure.

3.1 Local & distributed representations

Vector representations come in two flavors:
localist and distributed. In some respects,
localist  and  distributed  representations  are 
equivalent.  They  can  be  indistinguishable 
when  features  are  numerous  and  fine-grained. 
Also,  localist  representations  can  be  mapped 
to  distributed  ones  by  a  simple  linear  map,  and 
back  by  a  thresholded  linear  map.  However,  a 
crucial  difference  is  that  the  total  number  of 
possible  features  is  limited  to  the  vector  di¬ 
mensionality  in  localist  representations,  but  is 
exponential  in  vector  dimensionality  in  dis¬ 
tributed  representations.  This  gives  distribut¬ 
ed  representations  the  capacity  to  represent 
combinatorial features (such as Wharton et
al.'s (1994) sour-grapes feature “thing that is
desired but can't be obtained and hence is
denigrated”) in a moderate-sized vector.

What is needed is a systematic way of generating
and decoding the patterns which represent
combinatorial features. This is the role of
the binding operation. As a binding operation,
circular convolution provides a fast, systematic,
and reversible way of constructing new patterns
to represent combinatorial features.

3.2 Circular convolution

Circular convolution maps two real-valued
n-dimensional vectors onto one. If x and y are
n-dimensional vectors (subscripted 0 to n-1),
then the elements of $\mathbf{z} = \mathbf{x} \circledast \mathbf{y}$ are

$$z_i = \sum_{k=0}^{n-1} x_k \, y_{i-k}$$

where subscripts are taken modulo n and $\circledast$
denotes circular convolution. Circular convolution
can be viewed as a compression of the
outer (or tensor) product of the two vectors, as
shown in Figure 1. Each of the small circles
represents an element of the outer product of x
and y, e.g., the middle bottom one is $x_2 y_1$. The
elements of the circular convolution of x and y
are the sums of the outer product elements along
the wrapped diagonal lines.

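Circular convolution can be regarded as a
multiplication operator for vectors and has many
algebraic properties in common with scalar and
matrix multiplication. It is commutative
($\mathbf{x} \circledast \mathbf{y} = \mathbf{y} \circledast \mathbf{x}$),
associative ($\mathbf{x} \circledast (\mathbf{y} \circledast \mathbf{z}) = (\mathbf{x} \circledast \mathbf{y}) \circledast \mathbf{z}$),
and bilinear ($\mathbf{x} \circledast (\alpha\mathbf{y} + \beta\mathbf{z}) = \alpha\,\mathbf{x} \circledast \mathbf{y} + \beta\,\mathbf{x} \circledast \mathbf{z}$).
There is an identity vector I ($\mathbf{I} \circledast \mathbf{x} = \mathbf{x}$) and a zero vector
0 ($\mathbf{0} \circledast \mathbf{x} = \mathbf{0}$). Inverses $\mathbf{x}^{-1}$ exist for most vectors
($\mathbf{x}^{-1} \circledast \mathbf{x} = \mathbf{I}$).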
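As an illustration that is not part of the original paper, the definition of circular convolution and its algebraic properties can be checked numerically. The sketch below uses plain Python; the helper names `cconv` and `close`, the dimension n = 8, and the scalars 2 and -3 are assumptions chosen for the example. The identity vector is taken to be (1, 0, ..., 0).

```python
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def close(a, b, tol=1e-9):
    """True if two vectors agree element-wise up to rounding error."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))

rng = random.Random(0)
n = 8
x, y, z = ([rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3))
identity = [1.0] + [0.0] * (n - 1)      # I = (1, 0, ..., 0)
alpha, beta = 2.0, -3.0                 # arbitrary scalars for the bilinearity check

commutative = close(cconv(x, y), cconv(y, x))
associative = close(cconv(x, cconv(y, z)), cconv(cconv(x, y), z))
mixed = [alpha * p + beta * q for p, q in zip(y, z)]
bilinear = close(cconv(x, mixed),
                 [alpha * p + beta * q
                  for p, q in zip(cconv(x, y), cconv(x, z))])
has_identity = close(cconv(identity, x), x)

print(commutative, associative, bilinear, has_identity)
```

This direct implementation takes on the order of n squared multiplications; it is meant for clarity rather than speed.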


An association between two items x and
y can be represented by the convolution of the
two items: $\mathbf{x} \circledast \mathbf{y}$. The inverse vector of x can
be used to reconstruct y from $\mathbf{x} \circledast \mathbf{y}$:
$\mathbf{x}^{-1} \circledast (\mathbf{x} \circledast \mathbf{y}) = \mathbf{y}$. However, except under certain
restrictive conditions, the inverse is numerically
unstable and is not always the best choice
for decoding. For vectors which have randomly
chosen elements independently distributed
as N(0,1/n) (the normal distribution with mean
0 and variance 1/n) there is an approximate
inverse with attractive properties. The approximate
inverse of x is denoted by $\mathbf{x}^{*}$. It is a simple
rearrangement of the elements of x:

$$x^{*}_{i} = x_{-i}$$

where subscripts are modulo n. The approximate
inverse is simple to compute and is numerically
stable. Reconstruction using the
approximate inverse is noisy. The convolution
product $\mathbf{x}^{*} \circledast \mathbf{x} \circledast \mathbf{y}$ can be written as $\mathbf{y} + \boldsymbol{\eta}$,
where $\boldsymbol{\eta}$ can be considered as zero-mean
noise whose magnitude (variance) decreases
with increasing vector dimension.

Multiple associations can be represented by
the sum of the individual associations. For example,
suppose x, y, v, and w are all randomly
chosen vectors with elements independently
distributed as N(0,1/n). The association of x
with y and v with w can be represented by
$\mathbf{z} = \mathbf{x} \circledast \mathbf{y} + \mathbf{v} \circledast \mathbf{w}$. To find what is associated with x
we convolve z with $\mathbf{x}^{*}$. The result can be expressed
as $\mathbf{x}^{*} \circledast \mathbf{x} \circledast \mathbf{y} + \mathbf{x}^{*} \circledast \mathbf{v} \circledast \mathbf{w}$. The first term is
approximately equal to y and the second term
can be regarded as noise; it will not be highly
correlated with any of x, y, v, or w. The sum of
the two terms will be recognizable as a distorted
version of y.
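The decoding of one association out of a sum of two can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the helper names, the dimension n = 512 and the random seed are all choices made for the example, and the approximate inverse is implemented as the index reversal described above.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def involution(x):
    """Approximate inverse x*: x*[i] = x[(-i) mod n]."""
    n = len(x)
    return [x[(-i) % n] for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def rand_vec(n, rng):
    """Elements drawn independently from N(0, 1/n)."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

rng = random.Random(1)
n = 512                                   # assumed dimension for this sketch
x, y, v, w = (rand_vec(n, rng) for _ in range(4))

# z = x (*) y + v (*) w stores two associations in one n-dimensional vector.
z = [a + b for a, b in zip(cconv(x, y), cconv(v, w))]

# Convolving with x* gives y plus noise.
decoded = cconv(involution(x), z)
sims = {name: dot(decoded, vec)
        for name, vec in (("x", x), ("y", y), ("v", v), ("w", w))}
best = max(sims, key=sims.get)
print(best, {name: round(s, 2) for name, s in sims.items()})
```

With high probability the decoded vector has a dot product near 1 with y and near 0 with the other three vectors; the exact values depend on the seed and the dimension.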

3.3 Similarity preservation and randomization

Convolution preserves both similarity and
lack of similarity in a multiplicative fashion:
the similarity of two role-filler binding patterns
is approximately equal to the product of the
similarities of the respective role and filler patterns
(provided that the role patterns are not similar
to the filler patterns). Thus, if two bindings have
the same role, their similarity will be equal to
that of the fillers. Conversely, if two roles have
no similarity, bindings involving them will have
no similarity regardless of the fillers. Furthermore,
convolution is randomizing in that role-filler
binding patterns are not similar to either the role
or filler patterns.
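The three properties just stated can be illustrated numerically. The following sketch is an assumption-laden example rather than anything taken from the paper: the filler b is made similar to the filler a by adding an unrelated vector c to it, cosine is used as the similarity measure, and n = 512 is an arbitrary choice.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))

def rand_vec(n, rng):
    """Elements drawn independently from N(0, 1/n)."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

rng = random.Random(3)
n = 512
role, other_role, a, c = (rand_vec(n, rng) for _ in range(4))
b = [p + q for p, q in zip(a, c)]     # filler similar to a (cosine about 0.7)

filler_sim = cosine(a, b)
# Same role: binding similarity tracks filler similarity.
same_role = cosine(cconv(role, a), cconv(role, b))
# Dissimilar roles: bindings are dissimilar even with identical fillers.
diff_role = cosine(cconv(role, a), cconv(other_role, a))
# Randomizing: a binding is not similar to its own filler.
vs_filler = cosine(cconv(role, a), a)

print(round(filler_sim, 2), round(same_role, 2),
      round(diff_role, 2), round(vs_filler, 2))
```

The first two printed values should be close to each other, and the last two close to zero, up to noise that shrinks as n grows.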

3.4 HRRs for relational structure

Consider  representing  a  nested  proposition 
such  as  “Spot  bit  Jane,  causing  Jane  to  flee  from 
Spot”  in  a  vector  pattern.  We  would  like  this 
pattern  to  faithfully  record  structure  and  also 
to  be  suitable  for  detecting  at  least  superficial 
similarity  by  computing  dot-products. 

The  structure  of  a  proposition  can  be  rep¬ 
resented  by  superimposing  patterns  represent¬ 
ing  the  predicate  name  and  the  role-filler  bind¬ 
ings.  This  provides  a  structural  skeleton  that 
faithfully  records  structure. 

The skeleton HRR for the proposition
“Spot bit Jane” is constructed as follows:

$$\mathbf{K}_{P_{bite}} = \mathbf{bite} + \mathbf{bite}_{agt} \circledast \mathbf{spot} + \mathbf{bite}_{obj} \circledast \mathbf{jane}$$

The pattern bite represents the predicate
label, $\mathbf{bite}_{agt}$ and $\mathbf{bite}_{obj}$ its roles, and spot and
jane the entities “Spot” and “Jane”. If we
have the pattern $\mathbf{K}_{P_{bite}}$ and know the role patterns,
then we can reconstruct the filler patterns
by convolving $\mathbf{K}_{P_{bite}}$ with the approximate
inverses of the role patterns. For example,
$\mathbf{bite}_{agt}^{*} \circledast \mathbf{K}_{P_{bite}}$ gives a noisy version of spot
which, if necessary, can be cleaned up using an
auto-associative item memory. The pattern bite
is made a component of $\mathbf{K}_{P_{bite}}$ in order to identify
it as a bite proposition and thus allow a
system to deduce that the appropriate role patterns
for decoding are $\mathbf{bite}_{agt}$ and $\mathbf{bite}_{obj}$.

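The skeleton HRR pattern for the proposition
“Spot bit Jane” is an n-dimensional pattern
just like the patterns spot, bite, etc. Thus,
it is easily used as a filler in a higher-order
proposition. For example, the skeleton HRR $\mathbf{K}_{P}$
representing “Spot bit Jane, which caused Jane to
flee from Spot” is constructed as follows:

$$\mathbf{K}_{P} = \mathbf{cause} + \mathbf{cause}_{antc} \circledast \mathbf{K}_{P_{bite}} + \mathbf{cause}_{cnsq} \circledast \mathbf{K}_{P_{flee}}$$

where $\mathbf{K}_{P_{flee}}$ is the skeleton HRR for “Jane fled
from Spot”, built in the same way as the skeleton
for “Spot bit Jane”.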

The other goal for a vector representation
was that patterns should reflect superficial
similarity, i.e., two patterns should be similar if the
structures they represent merely involve similar
fillers or predicates. The presence of predicate
labels in HRRs ensures that patterns for
the same predicate are similar. However, skeleton
HRRs do not behave as desired with respect
to the presence of similar fillers: the randomizing
properties of convolution mean that
$\mathbf{role}_1 \circledast \mathbf{filler}_1$ is only similar to $\mathbf{role}_2 \circledast \mathbf{filler}_2$ to
the extent that $\mathbf{role}_1$ is similar to $\mathbf{role}_2$ and
$\mathbf{filler}_1$ is similar to $\mathbf{filler}_2$. HRRs are easily made
to reflect superficial similarity by superimposing
the filler patterns together with the structural
skeleton HRR. Thus, the fleshed-out HRR
for “Spot bit Jane” is as follows:

$$P_{bite} = \mathbf{bite} + \mathbf{spot} + \mathbf{jane} + \mathbf{bite}_{agt} \circledast \mathbf{spot} + \mathbf{bite}_{obj} \circledast \mathbf{jane}$$

Adding in the fillers makes decoding more
noisy, but does not prevent successful decoding.
For higher level propositions, the same idea
of adding in fillers can be applied recursively.
For example, the HRR for “Spot bit Jane, causing
Jane to flee from Spot” is constructed as
follows:

$$P_{flee} = \mathbf{flee} + \mathbf{spot} + \mathbf{jane} + \mathbf{flee}_{agt} \circledast \mathbf{jane} + \mathbf{flee}_{from} \circledast \mathbf{spot}$$

$$P = \mathbf{cause} + P_{bite} + P_{flee} + \mathbf{cause}_{antc} \circledast P_{bite} + \mathbf{cause}_{cnsq} \circledast P_{flee}$$

HRRs  constructed  like  this  will  be  similar  if 
they  merely  involve  similar  entities  or  predicates. 
Because  of  the  similarity  preserving  properties 
of  convolution,  they  will  be  even  more  similar 
if  the  entities  are  involved  in  similar  roles. 
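To make the encoding and decoding of a single proposition concrete, the sketch below builds the fleshed-out HRR for “Spot bit Jane” and decodes its two roles. It is an illustrative example only: the dictionary `V`, the helper names, the dimension n = 512 and the use of a simple best-dot-product choice among three candidate items are assumptions made here, and no normalization is applied.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def involution(x):
    """Approximate inverse x*: x*[i] = x[(-i) mod n]."""
    n = len(x)
    return [x[(-i) % n] for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def superpose(*vecs):
    """Element-wise sum of any number of vectors."""
    return [sum(parts) for parts in zip(*vecs)]

def rand_vec(n, rng):
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

rng = random.Random(7)
N = 512
V = {name: rand_vec(N, rng)
     for name in ("bite", "bite_agt", "bite_obj", "spot", "jane")}

# P_bite = bite + spot + jane + bite_agt (*) spot + bite_obj (*) jane
p_bite = superpose(V["bite"], V["spot"], V["jane"],
                   cconv(V["bite_agt"], V["spot"]),
                   cconv(V["bite_obj"], V["jane"]))

def decode(role):
    """Convolve with the role's approximate inverse; return the closest item."""
    noisy = cconv(involution(V[role]), p_bite)
    return max(("bite", "spot", "jane"), key=lambda name: dot(noisy, V[name]))

agent, obj = decode("bite_agt"), decode("bite_obj")
print(agent, obj)
```

Under these assumptions the agent role should decode to spot and the object role to jane, since the noise terms are small relative to the recovered filler at this dimension.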


3.5 The need for a “clean-up” memory

Convolution encodings are remarkably
compact: a number of associations between
n-dimensional patterns can be packed into one
n-dimensional pattern. The price we pay for this
compactness is noise in decoded vectors. Consequently,
if we want a convolution-based
associative-memory model to provide accurate
reconstructions of decoded patterns, it must be
equipped with an additional error-correcting
auto-associative item memory. This can clean
up the noisy patterns retrieved from the convolution
encodings. This clean-up memory must
store all the items that the system can produce.
When given a noisy version of one of those
items it must either output the closest item or
indicate that the input is not close to any of the
stored items. Note that only a few associations
are stored as convolution encodings in a single
pattern, whereas many patterns are stored in the
clean-up memory.
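The behaviour required of a clean-up memory can be sketched as a nearest-neighbour lookup with a rejection threshold. This is a minimal stand-in and not the auto-associative network the text has in mind: the function name `clean_up`, the use of cosine similarity, the threshold of 0.5 and the tiny four-dimensional items are all assumptions for illustration.

```python
import math

def cosine(x, y):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return sum(a * b for a, b in zip(x, y)) / norm if norm else 0.0

def clean_up(noisy, item_memory, threshold=0.5):
    """Return the name of the stored item closest to `noisy`, or None if
    no stored item reaches `threshold` (a hypothetical cut-off)."""
    best_name, best_sim = None, threshold
    for name, item in item_memory.items():
        sim = cosine(noisy, item)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return best_name

item_memory = {
    "spot": [1.0, 0.0, 0.0, 0.0],
    "jane": [0.0, 1.0, 0.0, 0.0],
    "bite": [0.0, 0.0, 1.0, 0.0],
}

print(clean_up([0.9, 0.2, -0.1, 0.3], item_memory))  # noisy version of spot
print(clean_up([0.0, 0.0, 0.0, 1.0], item_memory))   # close to nothing stored
```

The first call returns `spot` and the second returns `None`, matching the two required behaviours: output the closest item, or report that nothing stored is close.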

3.6 Normalization

The final point to consider when constructing
HRRs is maintaining the overall strength of
patterns and the statistical distribution of their
elements. The easiest way to do this is to normalize
all patterns to have a Euclidean length of
one. Here, the normalized version of the vector
x is denoted by $\langle \mathbf{x} \rangle$ and is defined as follows:

$$\langle \mathbf{x} \rangle = \frac{\mathbf{x}}{\sqrt{\sum_{i=0}^{n-1} x_i^2}}$$

4. ESTIMATING SIMILARITY

The six “dog bites human” episodes provide
a simple demonstration that HRR scores
can reflect similarity of structural arrangements,
as well as similarity of surface features. They also
demonstrate that a model based just on HRR
similarity scores can neatly explain the pattern
of human retrieval: LS > CM ≥ SF > AN > FOR.

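The HRRs for the probe (P) and the literally
similar episode are constructed as follows:

$$P_{bite} = \langle \mathbf{bite} + \langle \mathbf{spot} + \mathbf{jane} \rangle + \mathbf{bite}_{agt} \circledast \mathbf{spot} + \mathbf{bite}_{obj} \circledast \mathbf{jane} \rangle$$

$$P_{flee} = \langle \mathbf{flee} + \langle \mathbf{spot} + \mathbf{jane} \rangle + \mathbf{flee}_{agt} \circledast \mathbf{jane} + \mathbf{flee}_{from} \circledast \mathbf{spot} \rangle$$

$$P = \langle \mathbf{cause} + \langle P_{bite} + P_{flee} \rangle + \mathbf{cause}_{antc} \circledast P_{bite} + \mathbf{cause}_{cnsq} \circledast P_{flee} \rangle$$

$$E_{LS,bite} = \langle \mathbf{bite} + \langle \mathbf{fido} + \mathbf{john} \rangle + \mathbf{bite}_{agt} \circledast \mathbf{fido} + \mathbf{bite}_{obj} \circledast \mathbf{john} \rangle$$

$$E_{LS,flee} = \langle \mathbf{flee} + \langle \mathbf{fido} + \mathbf{john} \rangle + \mathbf{flee}_{agt} \circledast \mathbf{john} + \mathbf{flee}_{from} \circledast \mathbf{fido} \rangle$$

$$E_{LS} = \langle \mathbf{cause} + \langle E_{LS,bite} + E_{LS,flee} \rangle + \mathbf{cause}_{antc} \circledast E_{LS,bite} + \mathbf{cause}_{cnsq} \circledast E_{LS,flee} \rangle$$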

The HRRs for the other episodes are built
in an analogous fashion. The patterns for members
of the same species (types) are designed to
be similar. The complete set of base vectors and
tokens used in this experiment is shown in Table 1.
All base and identity (id) vectors were
randomly chosen with elements independently
distributed as N(0,1/n).

Average HRR similarity scores are shown
in Table 2. These are from 100 runs with different
random base and identity vectors, and a
vector dimension of 2048. The directions of
differences between average similarity scores
were reliable: the standard deviation of the
scores ranged between 0.016 and 0.026.


Base vectors                  Token vectors

person       bite             jane  = ⟨person + id_jane⟩
dog          flee             john  = ⟨person + id_john⟩
cat          cause            fred  = ⟨person + id_fred⟩
mouse                         spot  = ⟨dog + id_spot⟩
                              fido  = ⟨dog + id_fido⟩
bite_agt     bite_obj         rover = ⟨dog + id_rover⟩
flee_agt     flee_from        felix = ⟨cat + id_felix⟩
cause_antc   cause_cnsq       mort  = ⟨mouse + id_mort⟩

Table 1.
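A single run of this kind of experiment can be sketched in plain Python. The code below is an illustration under stated assumptions, not the author's code: it uses a dimension of 512 rather than 2048 to keep the pure-Python convolution fast, performs one run instead of averaging 100, and all helper names (`prop`, `bite`, `flee`, `cause`, `BASE`, `TOKEN`) are invented for the example. Scores from one small run will therefore differ from the averages reported in Table 2.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(*vecs):
    """<v1 + v2 + ...>: superimpose the vectors and scale to unit length."""
    total = [sum(parts) for parts in zip(*vecs)]
    length = math.sqrt(dot(total, total))
    return [t / length for t in total]

def rand_vec(n, rng):
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

N = 512                       # assumption: smaller than the paper's 2048
rng = random.Random(2)
BASE = {name: rand_vec(N, rng) for name in (
    "person", "dog", "cat", "mouse", "bite", "flee", "cause",
    "bite_agt", "bite_obj", "flee_agt", "flee_from",
    "cause_antc", "cause_cnsq")}
SPECIES = {"jane": "person", "john": "person", "fred": "person",
           "spot": "dog", "fido": "dog", "rover": "dog",
           "felix": "cat", "mort": "mouse"}
# token = <species + id>, with a fresh random identity vector per token
TOKEN = {name: normalize(BASE[kind], rand_vec(N, rng))
         for name, kind in SPECIES.items()}

def prop(pred, role1, filler1, role2, filler2):
    """<pred + <f1 + f2> + role1 (*) f1 + role2 (*) f2>"""
    return normalize(BASE[pred], normalize(filler1, filler2),
                     cconv(BASE[role1], filler1), cconv(BASE[role2], filler2))

def bite(agent, obj):
    return prop("bite", "bite_agt", TOKEN[agent], "bite_obj", TOKEN[obj])

def flee(agent, source):
    return prop("flee", "flee_agt", TOKEN[agent], "flee_from", TOKEN[source])

def cause(antecedent, consequent):
    return prop("cause", "cause_antc", antecedent, "cause_cnsq", consequent)

probe = cause(bite("spot", "jane"), flee("jane", "spot"))
episodes = {
    "LS": cause(bite("fido", "john"), flee("john", "fido")),
    "SF": cause(flee("john", "fido"), bite("fido", "john")),
    "CM": cause(bite("fred", "rover"), flee("rover", "fred")),
    "AN": cause(bite("mort", "felix"), flee("felix", "mort")),
    "FOR": cause(flee("mort", "felix"), bite("felix", "mort")),
}
scores = {name: dot(probe, hrr) for name, hrr in episodes.items()}
for name, score in scores.items():
    print(name, round(score, 2))
```

Because every HRR is normalized, each score is a plain dot product of unit vectors. The literally similar episode should score clearly highest; smaller differences between the other episodes may not be reliable in a single run at this dimension.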

For comparison, MAC-style similarity
scores are also shown. These are modeled after
the MAC stage of Forbus et al.'s (1994) MAC/FAC
model. They are based on the dot product
of normalized content vectors over the following
features: person, dog, mouse, cat, cause,
bite, and flee. For example, the content vector
for the probe is $(1,1,0,0,1,1,1)/\sqrt{5}$.
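The content-vector scores are simple enough to compute by hand, and the following sketch does so for all five episodes. The feature order follows the text; the function names and the representation of each episode as a set of feature names are assumptions made for this example.

```python
import math

FEATURES = ("person", "dog", "mouse", "cat", "cause", "bite", "flee")

def content_vector(present):
    """Binary feature vector over FEATURES, scaled to unit length."""
    raw = [1.0 if feature in present else 0.0 for feature in FEATURES]
    length = math.sqrt(sum(v * v for v in raw))
    return [v / length for v in raw]

def mac_score(a, b):
    """Dot product of two normalized content vectors."""
    return sum(p * q for p, q in zip(a, b))

dog_person = {"person", "dog", "cause", "bite", "flee"}
cat_mouse = {"mouse", "cat", "cause", "bite", "flee"}

probe = content_vector(dog_person)        # (1,1,0,0,1,1,1)/sqrt(5)
episodes = {"LS": dog_person, "SF": dog_person, "CM": dog_person,
            "AN": cat_mouse, "FOR": cat_mouse}
scores = {name: round(mac_score(probe, content_vector(features)), 2)
          for name, features in episodes.items()}
print(scores)
# {'LS': 1.0, 'SF': 1.0, 'CM': 1.0, 'AN': 0.6, 'FOR': 0.6}
```

The episodes involving a dog and a person share all five features with the probe, giving 5/5 = 1.0; the cat-and-mouse episodes share only the three predicate features, giving 3/5 = 0.6. Content vectors carry no information about which entity fills which role, so episodes with the same features receive the same score.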

The pair of episodes $E_{LS}$ and $E_{SF}$ each have
the same surface commonalities (object features
and predicate names) with the probe. The difference
between them is that $E_{LS}$ is structurally
isomorphic to the probe, while $E_{SF}$ is not. Because
there is no structural information beyond
predicate names encoded in content vectors,
$E_{LS}$ and $E_{SF}$ have the same content-vector similarity
to the probe. On the other hand, the HRR
similarity scores indicate that $E_{LS}$ is more similar
to the probe than $E_{SF}$.

When episodes do not share object attributes
with the probe, HRR scores are low and
do not always reflect structural match. Although
in Table 2 the HRR score for $E_{AN}$ is higher than
for $E_{FOR}$ (due to the “bite” and “flee” propositions
filling the same roles in $E_{AN}$ as in the
probe), this difference is not reliable. It is possible
to construct other FOR examples that have
a higher score than AN examples (Plate, 1994).

$E_{CM}$ is a cross-mapped analogy. It has the
same structure and types of objects as the
probe, but unlike $E_{LS}$ and the probe, the similar
objects do not map to each other (the dog
maps to the person, and the person maps to
the dog). Since HRR similarity scores are sensitive
to having similar objects fill similar
roles, $E_{CM}$ has a lower HRR similarity to the
probe than $E_{LS}$. In contrast, the content-vector
similarities of $E_{CM}$ and $E_{LS}$ to the probe are
the same.

4.1 Why HRR dot-products reflect
structural  similarity 

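HRR dot-products reflect structural similarity
because of the presence of components
representing combinatorial features, such as
$\mathbf{bite}_{agt} \circledast \mathbf{spot}$, $\mathbf{cause}_{antc} \circledast \mathbf{bite}$, and
$\mathbf{cause}_{antc} \circledast \mathbf{bite}_{agt} \circledast \mathbf{spot}$.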

All of these higher-order features derive
from role-filler bindings. Consequently, the
HRRs described here reflect differences in
structural similarity when there are differences
in whether similar objects fill similar roles.
Hence the large difference between $E_{CM}$ and
$E_{LS}$ in their HRR similarity scores with the
probe. Although they have the same objects
and isomorphic structure, $E_{CM}$ does not have
similar objects filling the same roles as in the
probe. Thus, $E_{CM}$ has combinatorial features
like $\mathbf{bite}_{agt} \circledast \mathbf{person}$, which are not at all similar
to those like $\mathbf{bite}_{agt} \circledast \mathbf{dog}$.


This  pattern  of  sensitivity  to  structural  sim¬ 
ilarity,  in  which  structural  similarity  is  only 
detected  when  similar  objects  fill  similar  roles, 
is  very  similar  to  the  pattern  observed  by  Ross 
(1989)  in  experiments  with  people.  Ross  found 
that  shared  structure  enhanced  retrieval  in  the 
presence  of  similar  objects,  provided  that  cor¬ 
responding  objects  were  similar,  and  that  cross- 
mapping  inhibited  retrieval. 

5.  INTERPRETATIONS  OF 
AN  ANALOGY 

Retrieval  of  analogies  is  only  the  first  step 
in  many  analogy  processing  tasks.  After  retriev¬ 
ing  a  potentially  analogous  episode  we  may 
want  to  decode  the  structure  in  order  to  evalu¬ 
ate  more  accurately  the  degree  of  structural 
consistency, or to use the episode for analogical
reasoning. The structure of an HRR could be
decoded using the techniques described in Section 3,
and then used in a symbolic processor
like  SME  or  in  some  other  connectionist  archi¬ 
tecture.  However,  some  apparently  more  sym¬ 
bolic  tasks,  like  finding  corresponding  entities, 
and  thus  deriving  an  interpretation  of  an  analo¬ 
gy,  can  be  computed  with  vector  operations 
directly  on  HRRs. 

Consider the probe P “Spot bit Jane, causing
Jane to flee from Spot”, and $E_{LS}$ “Fido bit
John, causing John to flee from Fido.” The
entity corresponding to Jane (which is John)
can be found in two steps:


1. Extract the roles Jane fills in the probe
with the operation:

$$\text{jane-roles} = \langle P \circledast \mathbf{jane}^{*} \rangle$$

This pattern is a blend of various roles and
other noise patterns. The following are the positive
dot-products of the jane-roles pattern with
other role patterns:

$$\text{jane-roles} \cdot \mathbf{cause}_{antc} = 0.20$$
$$\text{jane-roles} \cdot \mathbf{cause}_{cnsq} = 0.18$$
$$\text{jane-roles} \cdot \mathbf{flee}_{agt} = 0.13$$
$$\text{jane-roles} \cdot \mathbf{bite}_{obj} = 0.12$$

2. Use jane-roles to extract the fillers from
$E_{LS}$ and compare with the entities in $E_{LS}$:

$$\langle E_{LS} \circledast \text{jane-roles}^{*} \rangle \cdot \mathbf{john} = 0.38$$
$$\langle E_{LS} \circledast \text{jane-roles}^{*} \rangle \cdot \mathbf{fido} = 0.05$$

The most similar pattern is john, which is
in fact the entity in $E_{LS}$ corresponding to Jane.


                                                          Commonalities with probe          Similarity scores
                                                          Object    First-order  Higher-
                                                          attrib-   relation     order
       Episodes in long-term memory                       utes      names        structure    HRR     MAC

P      Spot bit Jane, causing Jane to flee from Spot.
E_LS   Fido bit John, causing John to flee from Fido.     ✓         ✓            ✓            0.71    1.0
E_SF   John fled from Fido, causing Fido to bite John.    ✓         ✓            ✗            0.47    1.0
E_CM   Fred bit Rover, causing Rover to flee from Fred.   ✓         ✓            ✓            0.47    1.0
E_AN   Mort bit Felix, causing Felix to flee from Mort.   ✗         ✓            ✓            0.42    0.6
E_FOR  Mort fled from Felix, causing Felix to bite Mort.  ✗         ✓            ✗            0.30    0.6

Table 2.


⟨E_LS ⊛ jane-roles*⟩     john   0.38     fido   0.07     ✓
⟨E_CM ⊛ jane-roles*⟩     fred   0.25     rover  0.17     ✗
⟨E_AN ⊛ jane-roles*⟩     felix  0.16     mort   0.09     ✓
⟨E_SF ⊛ jane-roles*⟩     john   0.23     fido   0.07     ?
⟨E_FOR ⊛ jane-roles*⟩    mort   0.11     felix  0.06     ?

Table 3.
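The two-step extraction of the entity corresponding to Jane can be sketched for the literally similar episode as follows. This is an illustrative reconstruction under assumptions: the dimension is 512 rather than 2048, a single random seed is used, and the helper names (`prop`, `episode`, `BASE`, `TOKEN`) are invented here. The resulting dot products will therefore not reproduce the values reported in Table 3.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def involution(x):
    """Approximate inverse x*: x*[i] = x[(-i) mod n]."""
    n = len(x)
    return [x[(-i) % n] for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(*vecs):
    """<v1 + v2 + ...>: superimpose the vectors and scale to unit length."""
    total = [sum(parts) for parts in zip(*vecs)]
    length = math.sqrt(dot(total, total))
    return [t / length for t in total]

def rand_vec(n, rng):
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

N = 512                       # assumption: smaller than the paper's 2048
rng = random.Random(4)
BASE = {name: rand_vec(N, rng) for name in (
    "person", "dog", "bite", "flee", "cause", "bite_agt", "bite_obj",
    "flee_agt", "flee_from", "cause_antc", "cause_cnsq")}
TOKEN = {name: normalize(BASE[kind], rand_vec(N, rng))
         for name, kind in (("jane", "person"), ("john", "person"),
                            ("spot", "dog"), ("fido", "dog"))}

def prop(pred, role1, filler1, role2, filler2):
    """<pred + <f1 + f2> + role1 (*) f1 + role2 (*) f2>"""
    return normalize(BASE[pred], normalize(filler1, filler2),
                     cconv(BASE[role1], filler1), cconv(BASE[role2], filler2))

def episode(biter, bitten):
    """HRR for 'biter bit bitten, causing bitten to flee from biter'."""
    bite = prop("bite", "bite_agt", TOKEN[biter], "bite_obj", TOKEN[bitten])
    flee = prop("flee", "flee_agt", TOKEN[bitten], "flee_from", TOKEN[biter])
    return prop("cause", "cause_antc", bite, "cause_cnsq", flee)

probe, e_ls = episode("spot", "jane"), episode("fido", "john")

# Step 1: extract the roles Jane fills in the probe.
jane_roles = normalize(cconv(probe, involution(TOKEN["jane"])))
# Step 2: extract the fillers of those roles in E_LS.
extracted = normalize(cconv(e_ls, involution(jane_roles)))

sims = {name: dot(extracted, TOKEN[name]) for name in ("john", "fido")}
print({name: round(s, 2) for name, s in sims.items()})
```

The extracted pattern should be noticeably more similar to john than to fido, which identifies John as the entity corresponding to Jane.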


Table  3  shows  the  extraction  of  the  entities 
corresponding  to  Jane  in  the  various  episodes. 
Correct  extractions  are  checkmarked,  and  cas¬ 
es  where  there  is  no  clear  corresponding  object 
have  a  question  mark. 

The correct answer is obtained in $E_{LS}$, where
corresponding objects are similar, and in $E_{AN}$,
where there is no object similarity. This extraction
process has a bias towards choosing similar
entities as the corresponding ones, which leads
to a reasonable answer for $E_{SF}$ and an incorrect
answer for $E_{CM}$. There are no correct answers for
$E_{SF}$ and $E_{FOR}$ because there is no consistent
mapping between P and those episodes. However,
because of the bias for mapping similar
items, John is strongly indicated to be the entity
in $E_{SF}$ corresponding to Jane. The only wrong
answer is given for the cross-mapped analogy
$E_{CM}$, where again the more similar object is indicated
to be the corresponding one. Again, the
effect of cross-mapping is similar to that observed
by Ross (1989) in people: cross-mapping
causes less accurate mapping performance.

Closer examination of the extraction process
reveals both the reason for this bias and
several ways of eliminating it, if that should be
desired. Consider patterns containing just two
of the components from P and $E_{CM}$:

$$P' = \mathbf{cause} + \mathbf{bite}_{obj} \circledast \mathbf{jane}$$

$$E'_{CM} = \mathbf{cause} + \mathbf{bite}_{obj} \circledast \mathbf{rover}$$

The roles of Jane in P' are computed as:

$$\text{jane-roles}' = P' \circledast \mathbf{jane}^{*} \approx \mathbf{cause} \circledast \mathbf{jane}^{*} + \mathbf{bite}_{obj}$$

The role pattern $\mathbf{bite}_{obj}$ (and other role patterns
like $\mathbf{flee}_{agt}$ and $\mathbf{cause}_{antc}$ in the full version
of $P \circledast \mathbf{jane}^{*}$) are what are wanted here. The other
patterns like $\mathbf{cause} \circledast \mathbf{jane}^{*}$, which are not roles
at all, are the source of the same-type bias in
finding the corresponding object. When
jane-roles' is used to extract the fillers from $E'_{CM}$,
we get the following:

$$\text{corresp}' = \text{jane-roles}'^{*} \circledast E'_{CM}$$

$$\approx (\mathbf{cause} \circledast \mathbf{jane}^{*} + \mathbf{bite}_{obj})^{*} \circledast (\mathbf{cause} + \mathbf{bite}_{obj} \circledast \mathbf{rover})$$

$$\approx \mathbf{jane} + \mathbf{cause}^{*} \circledast \mathbf{jane} \circledast \mathbf{bite}_{obj} \circledast \mathbf{rover} + \mathbf{bite}_{obj}^{*} \circledast \mathbf{cause} + \mathbf{rover}$$

This includes the pattern rover as desired,
but also includes the pattern jane
(from $(\mathbf{cause} \circledast \mathbf{jane}^{*})^{*} \circledast \mathbf{cause}$). Although
corresp' only contains one term like this,
there is a jane component in corresp for
every pattern which is shared by P and $E_{CM}$.
This adds up to a very strong component of
jane in corresp. When corresp is compared
to the fillers of $E_{CM}$, corresp is more similar
to fred than rover, due to the strong jane
component in corresp.
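The origin of the bias can be seen numerically in the reduced two-component example. The sketch below is illustrative only: it uses independent random vectors for cause, the object role of bite, jane and rover, a dimension of 1024 and a fixed seed, all of which are assumptions of the example rather than details from the paper.

```python
import math
import random

def cconv(x, y):
    """Circular convolution: z[i] = sum over k of x[k] * y[(i - k) mod n]."""
    n = len(x)
    return [sum(x[k] * y[(i - k) % n] for k in range(n)) for i in range(n)]

def involution(x):
    """Approximate inverse x*: x*[i] = x[(-i) mod n]."""
    n = len(x)
    return [x[(-i) % n] for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def rand_vec(n, rng):
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

N = 1024
rng = random.Random(9)
cause, bite_obj, jane, rover = (rand_vec(N, rng) for _ in range(4))

p_prime = add(cause, cconv(bite_obj, jane))       # P'    = cause + bite_obj (*) jane
e_cm_prime = add(cause, cconv(bite_obj, rover))   # E'_CM = cause + bite_obj (*) rover

# jane-roles' = P' (*) jane*, then corresp' = jane-roles'* (*) E'_CM
jane_roles = cconv(p_prime, involution(jane))
corresp = cconv(involution(jane_roles), e_cm_prime)

jane_component = dot(corresp, jane)     # unwanted: arises from the shared "cause"
rover_component = dot(corresp, rover)   # wanted: the true corresponding filler
print(round(jane_component, 2), round(rover_component, 2))
```

Both dot products should come out near 1, showing that a single shared non-role component already contributes a jane term about as strong as the desired rover term.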

One way of eliminating this similar-type bias
is to perform a linear, multi-way, role-clean-up
on jane-roles. This should pass all positive role
components and suppress negative role and non-role
components like $\mathbf{cause} \circledast \mathbf{jane}^{*}$. Thus, the
clean version of jane-roles is as follows:




Structured  Operations  with  Distributed  Vector  Representations 


$$\text{clean-jane-roles} = 0.20 \times \mathbf{cause}_{antc} + 0.18 \times \mathbf{cause}_{cnsq} + 0.13 \times \mathbf{flee}_{agt} + 0.12 \times \mathbf{bite}_{obj}$$


⟨E_LS ⊛ cleaned-jane-roles*⟩     john   0.27     fido   0.20     ✓
⟨E_CM ⊛ cleaned-jane-roles*⟩     fred   0.20     rover  0.29     ✓
⟨E_AN ⊛ cleaned-jane-roles*⟩     felix  0.25     mort   0.20     ✓
⟨E_SF ⊛ cleaned-jane-roles*⟩     john   0.25     fido   0.17     ?
⟨E_FOR ⊛ cleaned-jane-roles*⟩    mort   0.26     felix  0.19     ?

Table 4.


The corresponding objects extracted using
role clean-up are shown in Table 4. This slightly
slower process gives correct answers for the
episodes in which there is a consistent mapping.

The other way of avoiding the similar-type
bias is to use a different binding operation, in
which the algebraic properties of encoding and
decoding do not result in terms like
$(\mathbf{cause} \circledast \mathbf{jane}^{*})^{*} \circledast \mathbf{cause}$ equating to jane.
Possible suitable alternative binding operations are
discussed in Plate (1994).

There are two more limitations with these
fast techniques for deriving interpretations. One
is that each corresponding pair in a mapping is
extracted independently. This matters when there
is more than one consistent mapping. For example,
if we have two possible consistent mappings
$\{X \leftrightarrow A,\ Y \leftrightarrow B\}$ and $\{X \leftrightarrow B,\ Y \leftrightarrow A\}$, then the
choice of mapping for X should constrain the
choice for Y, but this will not be the case with
the above techniques. To overcome this problem
requires some other mechanism for checking
that a mapping is one-to-one. The other problem
is that these techniques fail when two different
objects have the same set of roles; in such
a case ambiguous results can be produced.

6.  CHUNKING  &  MEMORY 
ORGANIZATION 

HRRs provide a natural method for chunking.
In fact, a model based on HRRs must use
chunking if it is to store structures of unlimited
size. Chunking involves storing sub-structures
in the item memory, and using them when
decoding components of complex structures. For
example, to decode the agent of the cause
antecedent of P we first extract the cause antecedent
pattern. This gives a noisy version of that pattern,
which can be cleaned up by accessing item
memory and retrieving the closest match. Now
we have an accurate version of the antecedent
pattern, from which we can extract the filler of
the agent role.

To use chunks there must be a way of
referring, or pointing, to the chunks. In
content-addressable memory in general, "pointers" to
sub-chunks cannot be addresses, but must
somehow hint at the contents of the sub-chunk.
In HRRs, a decoded filler or sub-chunk, which
is derived from a chunk by decoding with a role
pattern, functions as an associative "pointer"
to a pattern in item memory. These associative
pointers are different from conventional pointers
in that their form conveys information about
their referent, information that is noisy but
immediately available without the need to access
memory. The advantage of having pointers that
encode information about their referents is that
some operations can be performed without
following the pointer. This can save much time.
For example, we can decode nested fillers
quickly if very noisy results are acceptable, or
we can get an estimate of the similarity of two
structures without decoding them.

6.1 Overall memory organization

In  a  system  that  uses  HRRs  there  must  be 
two  levels  of  memory  organization.  One  level 
encodes  the  structure  in  and  among  chunks.  The 
other  level  stores  large  numbers  of  chunks  (the 
large-scale  clean-up  memory). 

Convolution encoding is most suited for
encoding structure in and among small chunks
in memory. Because of its memory capacity
characteristics and noise in retrieval, convolution
does not provide a suitable associative
memory technique for the clean-up memory,
which must store all the chunks. For this purpose
we require some sort of large-scale
error-correcting auto-associative memory. This
large-scale memory should have the following
properties:

auto-associative & error-correcting ability:
when given a pattern, it should return
accurately the closest one(s) stored, and

high capacity: the number of patterns
which can be stored should be exponential
in the size of the patterns.

There are several ways the clean-up memory
could be implemented, e.g., Kanerva's
(1988) sparse distributed memory, and Baum
et al.'s (1988) various content addressable
schemes.

7.  DISCUSSION 

This paper has described a scheme for
encoding structure in vector representations based
on circular convolution. Other approaches, such
as Smolensky's (1990) tensor products, Pollack's
(1990) RAAMs, and Kanerva's (1996) binary
spatter codes, have much in common - see Plate
(1997) for a discussion - and could also be used
in models of analogy processing.

The  origin  of  patterns  representing  types 
such  as  'dog',  'cat'  and  'human'  must  be  addressed 
at  some  stage.  One  possible  automatic  technique 
for  learning  such  patterns  is  Latent  Semantic 
Analysis  (LSA),  which  learns  high-dimensional 
vector  patterns  for  words  from  large  quantities 
of text (Landauer, Laham, and Foltz 1998). These
patterns  reflect  human  similarity  judgements  and 
could  easily  be  used  with  HRRs. 

The existence of a fast technique for computing
good guesses at object correspondences
suggests a new model for analogical mapping.
Mapping could be done by "guessing" correspondences
while stepping through the components
of two structures and verifying that the proposed
correspondences are consistent. This would require
three mechanisms: one for traversing structures,
another for guessing correspondences, and
the last for storing correspondences and checking
their consistency. All can be implemented
with operations on vector representations. Such
a model differs from ACME and SME in that it
puts complexity at a different level. The top level
involves simple sequential computation (traversing
a structure and checking for mapping
inconsistencies) rather than complex structural
matching or construction of special networks,
while the bottom level involves information-rich
vector processing to measure similarities and
estimate correspondences.

8.  CONCLUSION 

Holographic Reduced Representations provide
a useful vector representation for analog
retrieval and processing tasks. They provide
chunking, which will be essential in any
vector-based model that stores large structures. They
also support fast operations for computing
similarity and object correspondences. These fast
operations appear to have the right amount of
power for modeling human abilities: their
strengths and weaknesses follow a similar pattern
to human performance on various analogy
tasks. In particular, HRRs provide a simple,
single-stage model of human performance on analog
retrieval: HRR dot-products are sensitive
to superficial similarity, and also to structural
similarity in situations where corresponding
roles have similar fillers, which is the same
pattern of performance as demonstrated by human
subjects on analog retrieval tasks.

ABSTRACT

This paper is about the computer analogy
of the brain and how it can both help and hinder
our understanding of the human mind. It is
based on the assumptions that the mind can be
understood in terms of the working of the brain,
and that the brain's function is to process
information: that it is some kind of a computer, as
contrasted for example with the heart which is
a pump. It is a computer whose design we do
not understand but try to, by analogy; that is,
by making a model - a "cognitive computer" -
based on our understanding of computers,
brains, and the working of the mind.

Human intelligence and language are fundamentally
analogical and figurative whereas lower
forms of intelligence and conventional computers
treat meaning literally. Therefore, the challenge
in designing a cognitive computer is to find
the kinds of information representation and operations
that make figurative meaning come out
naturally. The paper discusses holistic representation,
which is unconventional and looks promising
and worthy of investigation - it easily encodes
recursive (list) structure, for example - and
points out a danger in taking too literally cognitive
models that have been developed on conventional
computers, such as the following of rules.

INTRODUCTION 

The human mind is unlike any computer
or program we know. It is not literal, and when
meaning is taken literally, the result can be funny
or total nonsense. That's the humor of puns.
This must mean that the human mind, although
capable of being literal, is fundamentally figurative
or symbolical or analogical. How else
could we judge a literal interpretation as being
at once both accurate and wrong?

The growth of the human mind - our grasp
of things - is largely due to analogical perceiving
and thinking. Some things are meaningful
to us at birth or without learning; they are mostly
things necessary for survival. The rest we learn
through experience. Some learning is associative,
as when we learn cause and effect. This
kind of learning is basic to all animals.

To  follow  an  example,  or  imitate,  is  a  more 
advanced  form  of  learning  and  is  common  at 
least  in  mammals  and  birds.  It  involves  a  basic 
form  of  analogy.  The  learner  identifies  with  a 
role  model  -  perceives  one  as  the  other,  makes 
an  analogical  connection  or  mapping  between 
oneself  and  the  other. 

Full-fledged analogy is central to human
intelligence. We relate the unfamiliar to the
familiar, and we see the new in terms of the
old. This is most evident in language, which is
thoroughly metaphorical. New and unfamiliar
things are expressed and explained in familiar
terms that are understood not literally but figuratively.
It is possible that full-fledged analogy
and human language need each other and that
our faculties for them have coevolved.

Analogy is such an integral part of us that
we hardly notice it or pay it its proper dues.
That is, until we try to program a computer to
act like a human. AI has puzzled over the programming
of humanlike behavior for three decades.
At first it was thought that programming
computers to understand language, to translate,
and the like, was just around the corner, waiting
only for computers to get large and fast
enough. Now they are large and fast, many
things have been tried and much has been
learned, but the puzzle remains and we have
no clear idea of how to solve it.

This paper is a personal view of the lessons
this holds for us. The theme is that we must
rethink computing, put figurative meaning and
analogy at its center, and find computing mechanisms
that make it come out naturally. This
can be construed as designing a new kind of
computer, a "cognitive computer," that is a better
model of the brain than present-day computers
are. I will also try to verbalize things that
students of connectionist architectures take for
granted but that might puzzle others, the main
idea being that implementation matters when
we try to understand how the mind works.

THE  COMPUTER  AS  A  BRAIN 
AND  THE  BRAIN  AS  A  COMPUTER 

Equating computers with brains is an example
of analogical thinking. Early computers
were dubbed electronic brains, computers
have memory, and we even say that a program
knows, wants, or believes so and so.
Such anthropomorphizing seems natural to
us and it serves a purpose. It brings a technological
mystery within the realm of the familiar,
since we already have an idea of what
the brain does even if we don't know just how
it does it.

We also talk of the brain as a computer.
The appeal of this is that whereas the mechanisms
of the brain are hidden, those of the computer
are available to us, and through them we
could possibly understand the brain's mechanisms.
The principle is sound and is the thesis
behind Turing's imitation game: If we can
build a machine that behaves in the same way
as a natural system does, we have understood
the natural system.

Analogies not only help our thinking but
they also channel and limit it. The computer
analogy of the brain or of the mind has certainly
done so, as modeling in cognitive science
and AI has been dominated by programs
written for the computer, while philosophical
and qualitative treatment of issues is looked
upon with suspicion.

Many things are modeled successfully on
computers, such as weather, traffic flow,
strength of materials and structures, industrial
processes, and so forth. However, there
are special pitfalls when the thing being modeled
- the brain - is itself some kind of a computer:
the danger is that our models begin to
look like the computers they run on or the
programming languages they are written in.
For example, we talk of human short-term or
working memory and think of the computer's
active registers, or we talk of human long-term
memory and think of the computer's permanent
storage (RAM or disk), or we talk of
the grammar of a language and think of a
tree-structure or a set of rewriting rules programmed
in Lisp. Of course these are analogical
counterparts, but there is a danger of
taking them too literally. Human memory
works very differently from computer memory,
and the brain is not a Lisp machine nor
the mind a logic program. Some analogical
comparisons have not been at all useful in understanding
the working of the mind; for example,
equating the brain with the computer's
hardware and the mind with its software. Finally,
there is a worse danger of failing to notice
what is missing in our models of the mind
because it is missing or invisible in computers.
To safeguard against it, we must treat the
subject qualitatively: Our models may behave
as advertised, but is that how people behave;
for example, how they use language?

ARTIFICIAL NEURAL NETS
AS BIOLOGICALLY MOTIVATED
MODELS OF COMPUTING

The computer's and the brain's architectures are
very different. Perhaps the differences account
for the difficulty in programming computers to
be more lifelike and less literal-minded. This
has motivated the study of alternative computing
architectures called (artificial) neural nets
(NN), or parallel distributed processing (PDP),
or connectionist architectures. The hope is that
an architecture more similar to the brain's should
produce behavior more similar to the brain's,
which is a valid analogical argument. Unfortunately
it does not tell us what in the architecture
matters and what is incidental, and unfortunately
our neural nets are not significantly
more figurative than traditional computers.

Neural-net  research  has  made  a  valuable 
contribution  by  focusing  our  attention  on  rep¬ 
resentation.  Computer  theoreticians  and  engi¬ 
neers  know,  for  example,  that  the  representation 
of  numbers  has  a  major  effect  on  circuit  de¬ 
sign.  A  representation  that  works  well  for  addi¬ 
tion  works  reasonably  well  also  for  multiplica¬ 
tion,  whereas  a  representation  that  allows  very 
fast  multiplication  is  useless  for  addition.  Thus 
a  representation  is  a  compromise  that  favors 
some  operations  and  hinders  others. 

Information in computers is stored locally,
that is, in records with fields. Local representation
- one unit per concept - is common also in
neural nets. The alternative is to distribute information
from many sources over shared units.
It is more brainlike, at least superficially, and it
has been studied and used with neural nets for
a long time. I take distributed representation to
be fundamental to the brain's operation and believe
that a cognitive computer should be based
on it, and that therefore we should find out all
we can about the encoding of information into,
and operating with, distributed representations.

Neural-net research has shown that these
representations are robust and support some
forms of generalization: representations (patterns)
that are similar on the surface - close according
to some metric - are treated similarly,
for example as belonging in the same or similar
classes. The representations are also suitable
for learning from examples. The learning
takes place by statistical averaging or clustering
of representations (self-organizing). It is not
very creative but it can be subtle and lifelike,
which makes it cognitively interesting. It can
produce behavior that looks like rule-following
although the system has no explicit rules, as
was demonstrated with the learning of the past
tense of English verbs by Rumelhart and McClelland
(1986). This is a significant discovery,
in that it demonstrates a principle that probably
governs the working of the brain in general
and should govern the working of a cognitive
computer. What we see and describe as
rule-following is an emergent phenomenon that
reflects an underlying mechanism. However,
the rules do not produce the behavior even if
they may accurately describe it.

DESCRIPTION  VS.  EXPLANATION 

The distinction between description and
explanation of behavior is so central that I will
highlight it with an example. Consider heredity.
Long before the genetic bases of heredity
were known, people knew about dominant and
recessive traits and had figured out the basic
laws of inheritance. For example, a plant species
may come in three varieties, with white,
pink, or red flowers, and cross-pollinating the
white with the red always produces plants with
pink flowers. The specific rule is that all of
the first generation is pink, and when
pink-flowered plants are crossed with each other,
one-fourth of the offspring is white, one-fourth
red, and half pink. So we can say that the
inheritance mechanism works by this rule.
However, no mechanism in the reproductive
system keeps counting the numbers of offspring
to make sure that the proportions come
out right: I have made so and so many white
flowers, it's time to make the same number of
red flowers. It is not the rule that makes the
proportions come out in a certain way. The
proportions are an outward reflection of the
mechanism that passes traits from one generation
to the next. It is significant, however, that
long before chromosomes or genes, or RNA
and DNA were discovered, people speculated
correctly about a hereditary mechanism that
would produce offspring in those proportions.
Clearly, the laws provided a useful description
of the behavior, and accurate description
often leads to discovery and explanation.

The situation is similar with regard to language
and to mental functions at large. For example,
we attribute the patterns of a language to
its grammar and we devise sets of rules by which
the grammar works. However, it is not the grammar
that generates sentences in us when we speak
or write. The regularities captured in the grammar
are an outward expression of our underlying
mechanisms for language - the grammar is
an emergent phenomenon. This distinction is easily
lost when we produce language output with
computers, for there we actually use the grammar
to generate sentences, and we work hard to
develop a comprehensive grammar for a language.
And when we think of the computer as
a model of the brain and use computers to model
mental functions, we tacitly assume that the
brain uses grammar rules to generate language.
Formal logic as a model of thinking can be criticized
on similar grounds. It may describe rational
thought but it does not explain thinking.
Our understanding of the mechanisms of mind
is not yet sufficient to allow us to explain
thinking and language. The best we can do is
to describe them, but as our descriptions improve,
our chances for discovering the mechanisms
improve.
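
The heredity example can be made concrete with a small sketch. It assumes the standard textbook model that the text only alludes to: one gene with two alleles (labelled `R` and `W` here) and incomplete dominance, so that a mixed pair gives pink. The names `COLOR` and `offspring_proportions` are illustrative only. The point is that the code contains no counting rule for the proportions; they fall out of enumerating the equally likely allele pairings.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed model: one gene, two alleles, incomplete dominance.
COLOR = {"RR": "red", "RW": "pink", "WW": "white"}


def offspring_proportions(parent1, parent2):
    """Enumerate the equally likely pairings of one allele from each parent."""
    counts = Counter(
        COLOR["".join(sorted(a1 + a2))] for a1, a2 in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {color: Fraction(n, total) for color, n in counts.items()}


# White crossed with red: every offspring carries one allele of each kind.
first_generation = offspring_proportions("WW", "RR")

# Pink crossed with pink: the familiar proportions emerge from the mechanism.
second_generation = offspring_proportions("RW", "RW")
```

The described "rule" (all pink, then one-fourth white, one-fourth red, half pink) appears only when we inspect the output; nothing in the mechanism states it.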

THE BRAIN AS A COMPUTER
FOR MODELING THE WORLD,
AND OUR MODEL OF THE BRAIN AS
COMPUTING

It is useful to think of the brain as a computer
if we make the analogy between the two sufficiently
abstract. So what in computers should
we look at? The organization of computation as
a sequence of programmed instructions for manipulating
pieces of data stored in memory seems
like an overly restricted model of how the brain
or the mind works. A more useful analogy is
made at the level of computers as state machines,
the states being realized as configurations of
matter, or patterns in some physical medium.
Mental states and subjective experience then
correspond to - or are caused by - physical states
so that when a physical state repeats, the corresponding
subjective experience repeats. Thus the
patterns that define the states are the objective
counterpart of the subjective experience. Our
senses are the primary source of the patterns, and
our built-in faculties for pleasure and pain give
primary meaning to some of the patterns. Brains
are wired for rich feedback, and when the feedback
works in such a way that an experience created
by the senses - i.e., a succession of states -
can later be created internally, we have the basis
for learning. With learning, rich networks of
meaningful states can be built.

The  evolutionary  function  of  this  comput¬ 
er  is  to  make  the  world  predictable:  the  brain 
models  the  world  as  the  world  is  presented  to 
us  by  our  senses.  It  appears  to  compute  with 
patterns  of  activity  over  large  sets  of  neurons. 
To  study  such  computing  mathematically,  we 
can  model  the  patterns  by  large  patterns  of  bits, 
emphasizing  the  large  size  of  the  patterns,  as 
that gives the models their power. The key question
is how patterns that have already been
established and have become meaningful give
rise to new patterns: how do existing concepts
give rise to new concepts?

I have used the binary Spatter Code (Kanerva,
1996) to model computing with large patterns.
The code is related to Plate's Holographic
Reduced Representation (HRR; Plate, 1994)
and allows simple demonstrations of it. The
representation is distributed so that every item
of information that is included in a composed
pattern - every constituent pattern - contributes
to every bit of the composed pattern: the
patterns are holographic or holistic.

COMPUTING  WITH  LARGE 
PATTERNS 

The following description is in traditional
symbolic terms and uses a two-place relation
r(x, y) and a triplet t = (x, y, z) as examples.

Space  of  Representations 

All HRRs, including the Spatter Code, work
with large random patterns, or high-dimensional
random vectors. All things - variables, values,
composed structures, mappings between structures
- are elements of a common space: they are
very-high-dimensional random vectors with independent,
identically distributed components.
The dimensionality of the space, denoted by N, is
usually between 1,000 and 10,000. The Spatter
Code uses dense binary vectors (i.e., 0s and 1s are
equally probable). The vectors are written in boldface,
so that $\mathbf{x}$ stands for an N-vector representing
the variable or role x, and $\mathbf{a}$ stands for an N-vector
representing the value or filler a, for example.

Item  Memory  or  Clean-up  Memory 

Some  operations  produce  approximate  vec¬ 
tors  that  need  to  be  cleaned  up  (i.e.,  identified 
with  their  exact  counterparts).  That  is  done  with 
an  item  memory  that  stores  all  valid  vectors 
known  to  the  system,  and  retrieves  the  best- 
matching  vector  when  cued  with  a  noisy  vec¬ 
tor,  or  retrieves  nothing  if  the  best  match  is  no 
better  than  what  results  from  random  chance. 
The item memory performs a function that, at
least in principle, is performed by an autoassociative
neural memory.

Binding 

Binding is the first level of composition in
which things that are very closely associated
with each other are brought together. A variable
is bound to a value with a binding operator
that combines the N-vectors for the variable
and the value into a single N-vector for the
bound pair. The Spatter Code binds with coordinatewise
(bitwise) Boolean Exclusive-OR
(XOR, $\otimes$), so that the variable x having the value
a (i.e., x = a) is encoded by the N-vector $\mathbf{x} \otimes \mathbf{a}$
whose nth bit is the bitwise XOR $x_n \otimes a_n$ ($x_n$ and
$a_n$ are the nth bits of $\mathbf{x}$ and $\mathbf{a}$, respectively). An
important property of all HRRs is that binding
of two random vectors produces a random vector
that resembles neither of the two.

Unbinding 

The inverse of the binding operator breaks
a bound pair into its constituents: it finds the filler
if the role is given, or the role if the filler is
given. The XOR is its own inverse function, so
that, for example, $(\mathbf{x} \otimes \mathbf{a}) \otimes \mathbf{a} = \mathbf{x}$ finds the vector
to which $\mathbf{a}$ is bound in $\mathbf{x} \otimes \mathbf{a}$.
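
Binding and unbinding with XOR can be sketched in a few lines. This is a minimal illustration, assuming that an N-bit pattern is stored as a Python integer; the helper names (`random_vector`, `bind`, `distance`) and the choice N = 10,000 are not from the text.

```python
import random

N = 10_000  # dimensionality of the pattern space (assumed value)


def random_vector(rng):
    """Dense random N-bit pattern, stored as a Python int."""
    return rng.getrandbits(N)


def bind(u, v):
    """Bind (or unbind) two patterns with bitwise XOR."""
    return u ^ v


def distance(u, v):
    """Normalized Hamming distance: 0.0 is identical, about 0.5 is unrelated."""
    return bin(u ^ v).count("1") / N


rng = random.Random(0)
x = random_vector(rng)  # role (variable)
a = random_vector(rng)  # filler (value)

pair = bind(x, a)                 # encodes x = a
recovered_role = bind(pair, a)    # (x XOR a) XOR a
recovered_filler = bind(pair, x)  # (x XOR a) XOR x
```

Unbinding is exact here because XOR is its own inverse, while `distance(pair, x)` and `distance(pair, a)` stay close to 0.5, in line with the remark that a bound pair resembles neither constituent.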


Merging 

Merging is the second level of composition
in which identifiers and bound pairs are
combined into a single item. It has also been
called 'superimposing' (superposition), 'bundling',
and 'chunking'. It is done by a (normalized)
mean vector, and the merging of $\mathbf{G}$ and $\mathbf{H}$
is written as $[\mathbf{G} + \mathbf{H}]$, where [...] stands for
normalization. The relation r(a, b) can be represented
by merging the representations for r,
'x = a', and 'y = b'. It is encoded by

$$\mathbf{R} = [\mathbf{r} + \mathbf{x} \otimes \mathbf{a} + \mathbf{y} \otimes \mathbf{b}]$$

The normalized mean of binary vectors is given
by bitwise majority rule, with ties broken at
random. An important property of all HRRs is
that merging of two or more random vectors
produces a random vector that resembles each
of the merged vectors.
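
The majority rule can be sketched as follows, again assuming patterns stored as Python integers; `merge`, `distance`, and N = 10,000 are illustrative choices, not part of the original description.

```python
import random

N = 10_000  # assumed dimensionality


def random_vector(rng):
    """Dense random N-bit pattern, stored as a Python int."""
    return rng.getrandbits(N)


def distance(u, v):
    """Normalized Hamming distance between two patterns."""
    return bin(u ^ v).count("1") / N


def merge(vectors, rng):
    """Bitwise majority rule; ties (possible for even counts) broken at random."""
    result = 0
    for n in range(N):
        ones = sum((v >> n) & 1 for v in vectors)
        zeros = len(vectors) - ones
        if ones > zeros or (ones == zeros and rng.random() < 0.5):
            result |= 1 << n
    return result


rng = random.Random(1)
r, x, a, y, b = (random_vector(rng) for _ in range(5))

# R = [r + x XOR a + y XOR b], the encoding of the relation r(a, b)
R = merge([r, x ^ a, y ^ b], rng)
unrelated = random_vector(rng)
```

With three merged patterns, a bit of `R` differs from the same bit of one constituent only when the other two both disagree with it, so each constituent lies at a distance of about 0.25 from `R`, while an unrelated pattern lies at about 0.5.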

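Distributivity

In all HRRs, the binding and unbinding
operators distribute over the merging operator,
so that, for example,

$$[\mathbf{G} + \mathbf{H} + \mathbf{I}] \otimes \mathbf{a} = [\mathbf{G} \otimes \mathbf{a} + \mathbf{H} \otimes \mathbf{a} + \mathbf{I} \otimes \mathbf{a}]$$

Distributivity is a key to analyzing HRRs.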
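
For XOR binding and a three-way majority merge, distributivity can be checked directly. The sketch below assumes integer-coded patterns and uses a hypothetical helper `majority3`; with an odd number of merged patterns there are no ties, so the two sides agree bit for bit.

```python
import random

N = 10_000  # assumed dimensionality


def majority3(g, h, i):
    """Bitwise majority of three patterns (no ties with an odd count)."""
    return (g & h) | (g & i) | (h & i)


rng = random.Random(2)
G, H, I, a = (rng.getrandbits(N) for _ in range(4))

left = majority3(G, H, I) ^ a           # [G + H + I] XOR a
right = majority3(G ^ a, H ^ a, I ^ a)  # [G XOR a + H XOR a + I XOR a]
```

XORing with `a` either keeps or flips all three bits at a position, so the majority is kept or flipped with them. With an even number of patterns the identity would hold only up to the randomly broken ties.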

Probing 

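To find out whether the vector $\mathbf{a}$ appears
bound in another vector $\mathbf{R}$, we probe $\mathbf{R}$ with $\mathbf{a}$
using the unbinding operator. For example, if
$\mathbf{R}$ represents the above relation, probing it with
$\mathbf{a}$ yields a vector $\mathbf{X}$ that is recognizable as $\mathbf{x}$ ($\mathbf{X}$
will retrieve $\mathbf{x}$ from the item memory). The
analysis is as follows:

$$\mathbf{X} = \mathbf{R} \otimes \mathbf{a} = [\mathbf{r} + \mathbf{x} \otimes \mathbf{a} + \mathbf{y} \otimes \mathbf{b}] \otimes \mathbf{a}$$

which becomes

$$\mathbf{X} = [\mathbf{r} \otimes \mathbf{a} + (\mathbf{x} \otimes \mathbf{a}) \otimes \mathbf{a} + (\mathbf{y} \otimes \mathbf{b}) \otimes \mathbf{a}]$$

by distributivity and simplifies to

$$\mathbf{X} = [\mathbf{r} \otimes \mathbf{a} + \mathbf{x} + \mathbf{y} \otimes \mathbf{b} \otimes \mathbf{a}]$$

Thus $\mathbf{X}$ is similar to $\mathbf{x}$; it is also similar to $\mathbf{r} \otimes \mathbf{a}$
and $\mathbf{y} \otimes \mathbf{b} \otimes \mathbf{a}$, but they are not stored in the item
memory and thus act as random noise.

The functions described so far are sufficient
for traditional symbol processing, for example,
for realizing a Lisp-like list-processing system.
Holistic mapping, which is discussed next, is a
parallel alternative to what is traditionally accomplished
with sequential search and substitution.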
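
The probing analysis above can be sketched with a toy item memory. This is an illustration under stated assumptions: patterns are Python integers, the item memory is a plain dictionary searched exhaustively, and the rejection threshold of 0.45 in `clean_up` is a hypothetical stand-in for "no better than random chance".

```python
import random

N = 10_000  # assumed dimensionality


def distance(u, v):
    """Normalized Hamming distance between two patterns."""
    return bin(u ^ v).count("1") / N


def majority3(g, h, i):
    """Bitwise majority of three patterns."""
    return (g & h) | (g & i) | (h & i)


def clean_up(noisy, item_memory, threshold=0.45):
    """Name of the closest stored pattern, or None if it is not closer than threshold."""
    name, vec = min(item_memory.items(), key=lambda kv: distance(noisy, kv[1]))
    return name if distance(noisy, vec) < threshold else None


rng = random.Random(3)
names = ["r", "x", "y", "a", "b", "c"]
item_memory = {name: rng.getrandbits(N) for name in names}
r, x, y, a, b, c = (item_memory[name] for name in names)

R = majority3(r, x ^ a, y ^ b)  # R = [r + x XOR a + y XOR b]
X = R ^ a                       # probe R with a
```

Here `clean_up(X, item_memory)` returns `"x"`, probing with `b` recovers `"y"`, and probing with the unrelated pattern `c` returns `None`, because every term of the result then acts as noise.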

Holistic  Mapping  and  Simple 
Analogical  Retrieval 

Probing is the simplest form of holistic
mapping. It approximately maps a composed
pattern into one of its bound constituents, as
discussed above and seen in the following example.
Let $\mathbf{F}$ be a holistic pattern representing
France: that its capital is Paris, geographic location
is Western Europe, and monetary unit is
franc. Denote the patterns for capital, Paris,
geographic location, Western Europe, money,
and franc by $\mathbf{ca}$, $\mathbf{Pa}$, $\mathbf{ge}$, $\mathbf{WE}$, $\mathbf{mo}$, and $\mathbf{fr}$. France
is then represented by the pattern

$$\mathbf{F} = [\mathbf{ca} \otimes \mathbf{Pa} + \mathbf{ge} \otimes \mathbf{WE} + \mathbf{mo} \otimes \mathbf{fr}]$$

Probing $\mathbf{F}$ for "the Paris of France" is done by
mapping (XORing) it with $\mathbf{Pa}$ and it yields

$$\mathbf{F} \otimes \mathbf{Pa} = [\mathbf{ca} + \mathbf{ge} \otimes \mathbf{WE} \otimes \mathbf{Pa} + \mathbf{mo} \otimes \mathbf{fr} \otimes \mathbf{Pa}]$$

(see 'Probing' above) and is approximately
equal to $\mathbf{ca}$:

$$\mathbf{F} \otimes \mathbf{Pa} \approx \mathbf{ca}$$

XORing with $\mathbf{Pa}$ has mapped $\mathbf{F}$ approximately
into $\mathbf{ca}$, meaning that Paris is
France's capital.

Much more than that can be done in a single
mapping operation, as shown in the following
two examples. Let $\mathbf{S}$ be a holistic pattern
for Sweden with capital Stockholm ($\mathbf{St}$), located
in Scandinavia ($\mathbf{Sc}$), and with monetary unit
krona ($\mathbf{kr}$). This information about Sweden is
then represented by the pattern

$$\mathbf{S} = [\mathbf{ca} \otimes \mathbf{St} + \mathbf{ge} \otimes \mathbf{Sc} + \mathbf{mo} \otimes \mathbf{kr}]$$

We can now ask 'What is the Paris of Sweden?'
If we take the question literally and do
the mapping $\mathbf{S} \otimes \mathbf{Pa}$, as above, we get nothing
recognizable, so we must take Paris in a more
general sense. 'Paris of France' gave us a recognizable
result above (i.e., approximately $\mathbf{ca}$),
so we can use it: we can map $\mathbf{S}$ (XOR it) with
$\mathbf{F} \otimes \mathbf{Pa}$ and we get

$$\mathbf{S} \otimes \mathbf{F} \otimes \mathbf{Pa} \approx \mathbf{St}$$

which is recognizable as the pattern for Stockholm.
The derivation is based on distributivity
and is similar to the one given under 'Probing'.
The significant thing in $\mathbf{S} \otimes \mathbf{F} \otimes \mathbf{Pa}$ is that $\mathbf{S} \otimes \mathbf{F}$
can be thought of as a binding of two composed
patterns of equal status, rather than a binding
of a variable to a value, and also as a holistic
mapping between France and Sweden, capable
of answering analogy questions of the kind
'What is the Paris of Sweden?' and 'What is
the krona of France?'
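
The two analogy questions can be tried out in a short sketch. It assumes integer-coded patterns, a dictionary as item memory, and a hypothetical rejection threshold of 0.45; none of these implementation details come from the text.

```python
import random

N = 10_000  # assumed dimensionality


def distance(u, v):
    """Normalized Hamming distance between two patterns."""
    return bin(u ^ v).count("1") / N


def majority3(g, h, i):
    """Bitwise majority of three patterns."""
    return (g & h) | (g & i) | (h & i)


def clean_up(noisy, item_memory, threshold=0.45):
    """Name of the closest stored pattern, or None if it is not closer than threshold."""
    name, vec = min(item_memory.items(), key=lambda kv: distance(noisy, kv[1]))
    return name if distance(noisy, vec) < threshold else None


rng = random.Random(4)
names = ["ca", "ge", "mo", "Pa", "WE", "fr", "St", "Sc", "kr"]
mem = {name: rng.getrandbits(N) for name in names}
ca, ge, mo, Pa, WE, fr, St, Sc, kr = (mem[name] for name in names)

F = majority3(ca ^ Pa, ge ^ WE, mo ^ fr)  # France
S = majority3(ca ^ St, ge ^ Sc, mo ^ kr)  # Sweden

paris_of_france = clean_up(F ^ Pa, mem)      # role that Paris fills in France
literal_question = clean_up(S ^ Pa, mem)     # Paris taken literally
paris_of_sweden = clean_up(S ^ F ^ Pa, mem)  # 'What is the Paris of Sweden?'
krona_of_france = clean_up(F ^ S ^ kr, mem)  # 'What is the krona of France?'
```

The literal probe finds nothing, while the mapped probes recover Stockholm and franc. The mapped results are noisier than a direct probe (two approximate steps are chained), yet in this setting they remain well below the chance level of 0.5.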

Holistic  mapping  allows  multiple 
substitutions  at  once.  What  will  happen  to  the 
pattern  for  France  if  we  substitute  Stockholm 
for  Paris,  Scandinavia  for  Western  Europe, 
and  krona  for  franc,  all  at  once,  and  how  is 
the  substitution  done?  We  create  a  mapping 
pattern  as  above,  by  binding  the  correspond¬ 
ing  items  to  each  other  with  XOR  and  by 
merging  the  results: 

M = [Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr]

Mapping the pattern for France with M then gives

F ⊗ M

= [ca ⊗ Pa + ge ⊗ WE + mo ⊗ fr] ⊗ [Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr]

= [ ca ⊗ Pa ⊗ [Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr]
  + ge ⊗ WE ⊗ [Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr]
  + mo ⊗ fr ⊗ [Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr] ]

by distributivity, which becomes

[ [ca ⊗ Pa ⊗ Pa ⊗ St + ca ⊗ Pa ⊗ WE ⊗ Sc + ca ⊗ Pa ⊗ fr ⊗ kr]
+ [ge ⊗ WE ⊗ Pa ⊗ St + ge ⊗ WE ⊗ WE ⊗ Sc + ge ⊗ WE ⊗ fr ⊗ kr]
+ [mo ⊗ fr ⊗ Pa ⊗ St + mo ⊗ fr ⊗ WE ⊗ Sc + mo ⊗ fr ⊗ fr ⊗ kr] ]

again by distributivity. That simplifies to

[ [ca ⊗ St + ca ⊗ Pa ⊗ WE ⊗ Sc + ca ⊗ Pa ⊗ fr ⊗ kr]
+ [ge ⊗ WE ⊗ Pa ⊗ St + ge ⊗ Sc + ge ⊗ WE ⊗ fr ⊗ kr]
+ [mo ⊗ fr ⊗ Pa ⊗ St + mo ⊗ fr ⊗ WE ⊗ Sc + mo ⊗ kr] ]

and is recognizable as

[ca ⊗ St + ge ⊗ Sc + mo ⊗ kr]

In other words,

F ⊗ M ≈ S

so  that  a  single  mapping  operation  composed 
of  multiple  substitutions  changes  the  pattern  for 
France  to  an  approximate  pattern  for  Sweden, 
recognizable  by  the  clean-up  memory. 
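
The multiple-substitution mapping can be checked in the same spirit. This
is a minimal sketch, assuming 10,000-bit random patterns and a bitwise
majority for the merge; neither choice is specified in the text, and the
function names are illustrative only.

```python
import random

N = 10_000                 # pattern width in bits (assumed)
rng = random.Random(1)     # fixed seed so the run is repeatable

def bind(a, b):
    """Bind two patterns with bitwise XOR."""
    return a ^ b

def merge3(a, b, c):
    """Bitwise majority of three patterns (assumed reading of [a + b + c])."""
    return (a & b) | (a & c) | (b & c)

def distance(a, b):
    """Normalized Hamming distance between two patterns."""
    return bin(a ^ b).count("1") / N

# Nine independent random patterns for the roles and fillers in the text.
ca, ge, mo, Pa, WE, fr, St, Sc, kr = (rng.getrandbits(N) for _ in range(9))

F = merge3(bind(ca, Pa), bind(ge, WE), bind(mo, fr))   # France
S = merge3(bind(ca, St), bind(ge, Sc), bind(mo, kr))   # Sweden
M = merge3(bind(Pa, St), bind(WE, Sc), bind(fr, kr))   # substitution mapping

mapped = bind(F, M)             # apply all three substitutions at once
d_sweden = distance(mapped, S)  # should be clearly below 0.5
d_france = distance(mapped, F)  # should stay near 0.5
print(round(d_sweden, 3), round(d_france, 3))
```

A clean-up memory holding both F and S would therefore return S for the
mapped pattern, which is the sense in which F ⊗ M ≈ S.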

TOWARD  A  NEW  MODEL 
OF  COMPUTING 

Holistic  representation  and  holistic  map¬ 
ping  hint  at  the  possibility  of  organizing  com¬ 
puting  around  analogy.  However,  the  examples 
that  I  have  shown  are  not  very  strong.  This  could 
mean  that  large  random  patterns  and  the  sug¬ 
gested  operations  on  them  are  not  a  good  way 
to  compute,  but  it  is  also  possible  that  they  are, 
but  that  we  are  not  using  them  correctly.  What 
stands  out  about  the  examples  is  that  they  are 
built  around  established  notions  of  variable, 
value,  property,  relation,  and  the  like.  These  are 
high-level  abstractions  that  help  us  describe 
abstract  things  to  each  other,  but  they  may  be 
poor  indicators  of  what  goes  on  in  the  brain. 
For  example,  should  a  pattern  for  a  variable, 
such  as  capital  city  in  the  above  examples,  be 
related  to  patterns  that  stand  for  individual  cit¬ 
ies,  and  how  should  those  be  related  to  the  pat¬ 
terns  for  the  countries  they  are  capitals  of? 
There  are  many  questions  to  answer  before  we 
can  decide  about  the  utility  or  futility  of  com¬ 
puting  with  large  patterns. 


What  is  appealing  about  large  random  pat¬ 
terns  is  that  they  have  rich  and  subtle  mathe¬ 
matical  properties,  and  they  lend  themselves  to 
parallel  computing.  Furthermore,  the  brain’s 
connections  and  patterns  of  activity  suggest  that 
kind  of  computing. 

For  a  computer  to  work  like  the  human  mind, 
it  must  be  extremely  flexible  in  its  use  of  sym¬ 
bols.  It  cannot  stumble  on  the  multiplicity  of 
meanings  that  a  word  can  have  but  rather  it  must 
be  able  to  benefit  from  the  multiplicity.  The  hu¬ 
man  mind  conquers  the  unknown  by  making 
analogies  to  that  which  is  known,  it  understands 
the  new  in  terms  of  the  old.  In  so  doing  it  creates 
ambiguity  or,  rather,  it  creates  rich  networks  of 
mental  connections  and  becomes  robust. 

My  hunch  is  that  after  we  understand  how 
the  brain  handles  analogy  -  how  it  treats  one 
thing  as  another  -  and  have  programmed  it  into 
computers,  programming  computers  to  handle 
language  will  be  an  easy  task,  but  it  will  not  be 
easy before.

ABSTRACT

NetAB, a two-part, three-layered feedforward neural network, is used to
model the learning of relations and their application in discovering
solutions to analogy problems. Unlike other models of analogy, NetAB
allows relations to be learned and generalised. NetAB was trained and
tested against Rumelhart and Abrahamson’s (1973) vector model of
analogical problem solving, and against human solutions to analogy
problems. Ten subjects’ similarity judgements for eighteen animals were
subjected to multidimensional scaling, creating a conceptual space which
was used both as input to NetAB and for calculating solutions for the
Rumelhart and Abrahamson model. The results show that while NetAB models
Rumelhart and Abrahamson’s vector model favourably, neither model
predicts human solutions closely. Possible reasons for the discrepancies
are discussed.

Analogical  reasoning  is  a  creative  thought 
process.  Discovery  by  analogy  occurs  when  a 
known  knowledge  domain  (the  base)  is  used  to 
create  concepts  in  a  new  domain  (the  target). 
The  problem  a:b  ::  c  :  ?  is  an  example  of  dis¬ 
covery  analogy  in  its  simplest  form.  In  this  pa¬ 
per,  we  illustrate  and  investigate  the  underly¬ 
ing  representational  processes  of  discovery 
analogy,  using  neural  networks. 

Figure 1. Eduction of Relations (Base) and Eduction of Correlates
(Target), adapted from Spearman (1923).

Spearman  (1923)  described  analogy  as  com¬ 
prising  two  components:  Eduction  of  Relations 
and  Eduction  of  Correlates  (see  Figure  1). 

In Spearman’s model, similar to the later model of Sternberg (1977), the
relation between the objects in the base is not necessarily pre-defined,
but is educed during analogical reasoning (e.g., if ‘cat’ and ‘dog’ were
the two given objects in the base, then any or all of the relations ‘same
size’, ‘both friendly to man’ and ‘same level of domesticity’ might be
educed). The educed base relation is then used actively in the target
domain to educe the unknown element from the known element (e.g., if the
known target element is ‘horse’, then applying any of the listed
relations might educe ‘cow’ as the solution).

In Spearman’s model the analogical process is a process of discovery,
and relations are active agents in the problem solving process. We
adapted Spearman’s model to a connectionist framework in order to build
a model of analogy based on the following three principles:

1. Analogy by discovery rather than mapping: While much emphasis has
been placed on the mapping between base and target objects and relations
(Handler and Cooper, 1993; Holyoak, 1989; Forbus, Gentner, & Law, 1995;
Hummel & Holyoak, in press), some models have emphasised the discovery
aspects of analogy (Halford et al., 1993; Mitchell, 1990; Plate, 1993).
In most of these models, however, relations and arguments are
pre-defined within some knowledge structure. In contrast, the current
work is aimed at modelling discovery analogy where relations can be
learned and generalised so that new concepts can be created via the
analogy process.




J. E. McCredden


2. Relations and concepts represented in similar ways: To allow
structures to be represented (Gentner, 1983), models must allow nesting
of relations inside higher order relations. Thus concepts and relations
must be represented in similar ways so that they can be used
interchangeably inside structures (see Plate, 1991).

3. Relations as active agents in processing: Classical representations
for models of analogy require both a knowledge representation and a set
of processes to operate on the knowledge base. Neural networks differ
from propositional representations in that representations (i) can be
learned and (ii) can be active components in information processing. For
example, Wiles, Stewart, & Bloesch (1990) showed how every element
(object or relation) input to a recurrent net is an active operator on
the concept space represented by the hidden unit space (Elman, 1989).
Such active representation capabilities for relations may be necessary
for modelling “structural alignment” (Goldstone, 1991), which
demonstrates how relations act as powerful operators in how subjects
conceptualise base and target knowledge (Gentner & Markman, 1993).

In  order  to  model  discovery  analogy,  re¬ 
lations  must  be  modelled  as  active  entities 
such  that  they  can  be  (i)  created  in  the  base 
domain  and  (ii)  applied  in  the  target  domain 
to  create  new  concepts.  Two  neural  net  mod¬ 
els  which  can  be  viewed  as  modelling  these 
processes  are  HRR  (Plate,  1993)  and  STAR 
(Halford  et  al.,  1993). 

Currently, there is a limited theoretical basis for describing and
specifying relations within both classical and connectionist systems. In
semantic nets, relations are represented as nodes or links (Quillian,
1968), and in production systems, they are propositions represented as
role-filler structures (Anderson, 1973). In neural net models, they are
sometimes treated as elements just like their object arguments (e.g.,
Handler and Cooper, 1993). In these models there is often a confusion
between the label for a relation (e.g. ‘larger-than’) and the relation
itself (e.g. the relation larger-than involving all instances of one
object being larger than another). Furthermore, these models make no
provision for explicit representation of the bindings of the arguments
and labels into relational instances.

Representation of Relations

The mathematical definition of relations is used by Halford and Wilson
(1982) to specify relational knowledge. For example, a binary relation is
specified as a subset of ordered pairs from the Cartesian product of two
domains. Given this definition, a model of relations must have the
ability to represent both the overall relation and each instance.
Halford et al. (1987) suggest that representations of relations must
comprise a vector for each argument, a vector for the label, and a
representation of the binding.
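
The distinction between a relation, its label, and one of its instances
can be made concrete with a small set-based sketch. The animals and their
sizes are invented purely for illustration, and the variable names are
not taken from any of the cited models.

```python
from itertools import product

# Hypothetical sizes, used only to populate the example.
size = {"mouse": 1, "cat": 2, "dog": 3, "horse": 4}
animals = list(size)

# The relation itself: a subset of the Cartesian product animals x animals.
larger_than = {(a, b) for a, b in product(animals, animals) if size[a] > size[b]}

# The label is just a name; an instance binds that label to one ordered pair.
label = "larger-than"
instance = (label, "horse", "cat")

# An instance is valid when its argument pair is a member of the relation.
is_member = (instance[1], instance[2]) in larger_than
print(len(larger_than), is_member)
```

Keeping the set of pairs separate from the label string mirrors the point
that a model needs a representation for the overall relation, for the
label, and for each bound instance.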

Representation of binding is central to representation of relations (see
Hinton, 1986). Halford et al. (1997) provide a classification system for
models of relations in terms of the type of bindings used, i.e.,
role-filler or argument-argument, and the type of architecture used for
the binding, i.e., tensor products (Halford et al., 1993; Smolensky,
1990), convolution correlation (Plate, 1991), and synchronous
oscillation (Hummel & Holyoak, in press; Shastri & Ajjanagadde, 1993).

Bindings can also be represented in the hidden layer of a neural network
(Hinton, 1986). Furthermore, Elman (1989) showed how bindings are
organised into meaningful regions in recurrent networks (discovered using
principal components analysis). Related work on binding and
representational structure in the hidden layer of recurrent nets (Wiles,
Stewart, and Bloesch, 1990) and on structure in hidden unit
representations in simple feedforward nets (Wiles, 1993; Wiles and
Ollila, 1993) has shown that bindings can be represented in the hidden
layer of a feedforward net. Furthermore, these representations provide
some knowledge structure within their spatial organisation, such as
hierarchies, discrete regions, and intersecting regions (Wiles, 1993b).

Spatial systems have been used to map conceptual similarity spaces for
many domains. Multidimensional scaling techniques developed by Shepard
(1962) have been used to map out spaces such as the animal knowledge
domain (Henley, 1969; Rips et al., 1973). Similarity spaces provide
non-propositional representations for objects, where conceptual
similarity is represented by spatial distance.

Spatial  systems  may  be  extended  to  repre¬ 
sentations  of  relations  as  well  as  objects.  This 
possibility  has  been  exploited  in  the  work  of 
Rumelhart  and  Abrahamson  (1973),  who  used 
points  from  Henley’s  (1969)  animal  space  to 
represent  animal  arguments  to  the  similarity 
relation,  and  the  vector  difference  between 
those  points  to  represent  the  similarity  relation 
between  animals.  Rumelhart  and  Abrahamson 
used  the  relation  vector  in  a  full  model  of  anal¬ 
ogy,  described  later. 

Given that (i) bindings are crucial to models of relations, (ii) binding
regions are created in the hidden unit space of feedforward nets, and
(iii) relative spatial position has been used to model relations, we
suggest that a feedforward net can be used to learn bindings that
represent the relative spatial position of concepts from a semantic
space. Furthermore, we suggest that these bindings may be utilised by a
further net to solve analogy problems. We explored this possibility,
using animal knowledge space and Rumelhart and Abrahamson’s (1973) model
of relations and analogy, to design a network architecture as follows.

The Theoretical Mechanism: Rumelhart and Abrahamson’s (1973) vector
model of relations and analogy was used as the theoretical basis for the
network. Using Henley’s three dimensional Euclidean space, Rumelhart and
Abrahamson showed that analogical problem solving could be modelled
using vector subtraction and addition. That is, in the problem
a : b :: c : ? the relation between the two points in animal space (a and
b) can be calculated as the vector difference between the two points.
Then the vector can be applied to the known point in the target domain
(c) to find the solution. Thus, Rumelhart and Abrahamson propose that the
solution to an analogy (S) can be calculated as: S = (b - a) + c. Because
the geometrical shape of this formula in Euclidean space is a
parallelogram, we call it the ‘parallelogram model’ of analogy.

Figure 2. The parallelogram model of Rumelhart and Abrahamson as applied
to the problem ‘Cat is to Dog as Tiger is to What?’

Figure 2 shows the parallelogram model applied to the analogy problem
‘Cat is to Dog as Tiger is to What?’, where concepts are represented by
coordinates in Henley’s (1969) animal space. The composite relation Dog
bigger-than Cat & Dog same-ferocity as Cat & Dog less-human-than Cat is
calculated using the vector difference Dog - Cat and is added to the
coordinates for Tiger, resulting in the coordinates representing the
ideal solution to the analogy, I. Rumelhart and Abrahamson propose that
the nearest animal to the ideal solution, in this case Wolf, is then
given as the solution to the problem.
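
The parallelogram rule and the nearest-animal step can be sketched in a
few lines. The coordinates below are made up so that the arithmetic is
easy to follow; they are not Henley’s (1969) values, and the use of
Euclidean distance for "nearest" is an assumption of this sketch.

```python
import math

# Made-up coordinates on three dimensions (size, ferocity, humanness).
# These are NOT Henley's (1969) data; they only illustrate the arithmetic.
animals = {
    "cat":   (-0.8,  0.0,  0.4),
    "dog":   (-0.4,  0.0,  0.1),
    "tiger": ( 0.3,  0.9, -0.3),
    "wolf":  ( 0.6,  0.8, -0.6),
    "lion":  ( 0.4,  1.0, -0.2),
    "mouse": (-1.0, -0.6,  0.2),
}

def ideal_solution(a, b, c):
    """Parallelogram rule I = (b - a) + c, computed per dimension."""
    return tuple(bi - ai + ci
                 for ai, bi, ci in zip(animals[a], animals[b], animals[c]))

def nearest_animal(point):
    """Name of the stored animal with the smallest Euclidean distance to point."""
    return min(animals, key=lambda name: math.dist(animals[name], point))

ideal = ideal_solution("cat", "dog", "tiger")   # cat : dog :: tiger : ?
answer = nearest_animal(ideal)
print(ideal, answer)
```

With these invented coordinates the ideal point falls closest to the
entry named wolf, which mirrors the structure of the example in Figure 2
without reproducing its data.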

To  model  the  parallelogram  model  in  a 
network  architecture,  we  first  needed  to  con¬ 
struct  animal  knowledge  space  for  a  set  of  hu¬ 
man  subjects,  to  use  the  points  in  space  as  in¬ 
puts  to  the  net,  and  then  obtain  human  solu¬ 
tions  for  analogy  problems  from  within  that 
space  against  which  the  net  could  be  tested. 

The Problem Domain: Human Data: Conceptual animal space was first mapped
out for each subject using a similarity judgement task similar to that
of Henley (1969). That is, for each of 18 animals chosen for the problem
domain, ten subjects rank-ordered the similarity of all other animals to
that animal. These judgements were then subjected to multidimensional
scaling, such that three dimensions emerged. Similar to Henley’s
dimensions, they were labelled ‘size’, ‘ferocity’, and ‘domesticity’.
Next, subjects were given 30 randomly created analogy problems and the
solutions recorded.

A feedforward neural net, NetAB, was developed to model relations over
the constructed animal knowledge space, and to model analogy using the
parallelogram model as the basis for the architecture. The net comprised
two parts. The first part (NetB) was designed to model the Eduction of
Relations mechanism, and the second part (NetA) was designed to model
Eduction of Correlates. Consequently, the experimental design for the
network also had two parts. First, for each subject, conceptual space
was constructed and used to train the first part of the net with
relations from that subject’s conceptual space. Next, the second part of
the net was trained to access the relations to make identity mappings
(see below). Then the Rumelhart and Abrahamson analogy solutions, the
net’s analogy solutions and human analogy solutions could be compared.

Using  this  experimental  design,  NetAB  was 
designed  and  tested  as  follows: 

NETAB: APPLICATION OF BINDING

NetAB  comprised  two  nets:  (i)  NetB  for 
representing  relations,  arguments,  bindings  of 
relational  instances,  and  relation  labels,  and  (ii) 
NetA  for  representing  application  of  bindings, 
arguments,  and  new  concepts  discovered.  An 
outline  sketch  of  NetAB  is  shown  in  Figure  3. 

In order to test the NetAB model, two criteria for correctness were
used. Firstly, the parallelogram formula was used as the criterion
against which to test the goodness of the NetAB model. That is, we
investigated whether NetB could learn to represent relations between
animal pairs (a, b) as the vector differences (b - a) in hidden unit
space, and whether NetA could output S = (b - a) + c as the solution to
the analogy. Secondly, human conceptual space was used to construct
relations for training NetB, and human analogy solutions were used to
test the goodness of NetAB as a model of human analogical reasoning.

Figure 3. NetAB, comprising NetB, the binding mechanism, and NetA, the
application mechanism.

NetB: The Binding Net

NetB  was  the  Eduction  of  Relations  net.  It 
had  six  input  units,  six  output  units,  and  six  hid¬ 
den  units,  as  follows: 

Inputs: Each animal in a subject’s conceptual space was represented by a
vector of coordinates along each of the three dimensions. Inputs for NetB
were pairs of animal vectors from a subject’s conceptual space,
normalised to lie in the range [-1, +1]. An example of an input set for
subject 1 would be ‘kangaroo, koala’ represented as
(.71, -.43, -.71 | .51, 1.0, .28).

Outputs: Outputs were vectors representing the labels for the relations
between the input pairs along each dimension (size, ferocity, and
domesticity). The labels greater-than (+1, -1), equals (-1, +1), and
less-than (+1, +1) were chosen such that they had no mathematical
relationship to the vector difference between the two inputs, so as to
ensure symbolic, or arbitrary, representations of labels (McCredden,
1995b). An example of an output set for subject 1 corresponding to the
inputs ‘kangaroo, koala’ would be ‘greater-than (size), less-than
(ferocity), less-than (domesticity)’, represented as
(-1 +1 | +1 +1 | +1 +1).

Hidden Units: Two dimensions (two hidden units) were required for each
relational dimension (size, ferocity, domesticity), making six



NetAB: A Neural Network Model of Analogy by Discovery


hidden units in total. This decision was based on previous work
(McCredden, 1995a) which showed that if the hidden layer was given only
one dimension in which to represent relational bindings, the spatial
location of bindings was directly related to the vector difference
between the inputs, thus permitting only non-arbitrary, non-symbolic
representations of relations.

Training and Testing: With 18 animals in the problem domain, and three
relational dimensions for each pair of animals, there were 972 possible
input-output mappings. In order to test for generalisation, the
training-testing schedule chosen was to train with 70% of the
input-output pairs (N=678), and to test with the other 30% of unseen
relations (N=294). For each of ten subjects, NetB was run five times,
using a random selection of animal pairs for training and testing. The
criterion for evaluating the performance of the net was whether or not
NetB’s outputs were on the same side of zero as the expected output. For
example, if the two inputs were ‘koala, koala’, the output size relation
(-1, +1), i.e. ‘equals’, would have been expected. In this case, a NetB
output of (-.02, 0.8) would have been classified as correct while an
output of (.02, 0.8) would have been classified as incorrect.
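
The counting and the same-side-of-zero criterion can be written out as a
short check. The label codes are taken as they read in the text; the
function name and the choice to count ordered pairs including an animal
paired with itself (as in ‘koala, koala’) are assumptions of this sketch.

```python
# Two-unit label codes as given in the text.
LABELS = {
    "greater-than": (+1, -1),
    "equals":       (-1, +1),
    "less-than":    (+1, +1),
}

def same_side_of_zero(output, target):
    """True when every output unit has the same sign as its target unit."""
    return all(o * t > 0 for o, t in zip(output, target))

n_animals = 18
n_pairs = n_animals * n_animals   # ordered pairs, self-pairs included (assumed)
n_mappings = n_pairs * 3          # one relation per dimension

n_train, n_test = 678, 294        # split reported in the text
train_share = n_train / n_mappings

ok = same_side_of_zero((-0.02, 0.8), LABELS["equals"])   # example judged correct
bad = same_side_of_zero((0.02, 0.8), LABELS["equals"])   # example judged incorrect
print(n_mappings, round(train_share, 2), ok, bad)
```

Counting 18 x 18 ordered pairs on three dimensions reproduces the 972
mappings, and 678 of 972 is close to the stated 70%.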

Results 

Table 1 shows the mean total sum of squares for NetB outputs, and the
mean number of incorrect responses (for five simulations for each
subject, averaged over ten subjects, rounded to integers, and converted
to percentages of the total test set). The table shows that NetB learned
the relations fairly well, with few errors. Further inspection of the
outputs showed that most errors occurred for pairs that had small
differences, such that NetB incorrectly classified them as equals or as
a combination of either equals and less-than or equals and greater-than.
Generalisation to untrained relations was good, with similar types of
errors to the training set.

NetB demonstrates how a three-layered feed-forward net is capable of
learning labels for relations between pairs of animals represented by
points in conceptual space. The hidden units of NetB were investigated
further to see how this learning occurred.

Test Set     TSS   s.d.   Inc.   s.d.
Trained        2      3      1      1
Untrained     16      8      5      2

Table 1. The average total sum of squares (%) and incorrect
classifications (%) of relations.

For each input pair, in order to label the relation along each dimension
correctly for the outputs, the net needed only to classify the
difference on the given dimension into one of three categories
(greater-than, equals, or less-than). Investigations of the weights to
the hidden units showed that in NetB, these categories were represented
by three regions in two dimensional space. In general, each net used two
hidden units to represent bindings for each of the size, ferocity, and
domesticity dimensions, though there was some overlap due to
correlations between relational dimensions (i.e., if animal a is larger
than animal b it is often more ferocious as well). Figure 4 depicts a
case where two hidden unit dimensions coded for the size relation fairly
clearly. The three binding regions created by the net for greater-than,
equals, and less-than were then classified and transformed into the
appropriate output labels by the hyperplanes defined by the weights and
biases to the output units.

Figure 4 shows that the greater-than, less-than, and equals regions are
separated and arbitrarily placed within the space. Further analysis of
these regions for various size relations has shown that the binding
regions for the small less-than and small greater-than relations are
closer to the equals region than the binding regions for the large
relations (though such a spatial layout does not always occur or is not
always obvious)*.


* The procedure for mapping out the regions in hidden unit space was
repeated for several simulations until a clear hidden unit spatial
representation was found. Other factors such as correlations between
relational dimensions may be affecting the placement of bindings. These
are currently being investigated using principal components analysis,
and will be reported in future work.




Figure 4. Binding regions for the >, =, and < relations along the size
dimension for a particular run of NetB. (Recoverable plot labels: axis
hu2; hyperplane 2.)

Further inspection of the weights and biases to the hidden units shows
why spatial organisations occurred. In the parallelogram model, a
relation between two inputs a and b is calculated as (b - a). In NetB,
however, the bindings were calculated as some ordered measure of (b - a)
(denoted as Ord(b - a)), where the absolute value of the vector
difference was lost, but the relative values remained, such that small
vector differences gave hidden unit values closer to zero, while large
vector differences gave hidden unit values closer to ±1. That is, the
parallelogram model keeps ratio information about the relationship
between two animals, while NetB keeps only ordinal information.

ANALYSIS 

NetB takes perceptual-like representations as inputs and outputs
symbolic labels for the relationships between inputs. NetB embodies the
animal relations as a whole and, for each instance of the relation,
creates arbitrary yet information-rich bindings represented by relative
position in hidden unit space. Unlike previous models of analogy, the
representations are learned and can generalise to unseen input pairs.
Furthermore, bindings are explicitly represented during the problem,
such that they can be learned and utilised by further processes. This
gives NetAB an advantage over the parallelogram model, where the
relations are calculated and discarded. The bindings represented in NetB
are used to solve analogy problems by being utilised by a further net,
NetA, as described below.

NetA:  The  Application  Net 

NetA was designed to implement the second part of Spearman’s model of
analogy (Eduction of Correlates), and the second part of the
parallelogram model, which for the analogy a : b :: c : ? would be
S = (b - a) + c.

Inputs: The inputs to NetA were (i) the NetB hidden unit vector
representing the binding of a relation between a and b in the base, and
(ii) the vector for the target animal c. An example of an input set to
NetA for the ‘kangaroo : koala :: zebra : ?’ analogy would be the hidden
unit vector for the kangaroo koala binding, (-.21, .71, -1, 1, 1, 1), and
the vector representing the target element zebra, (.89, -.57, -.62).

Outputs: The output for NetA was a three dimensional vector representing
a hypothetical animal in conceptual space (Rumelhart and Abrahamson’s
‘ideal’ solution, I). It was assumed that some cleanup mechanism would
settle on a solution which produced the animal in conceptual space
closest to this point, but this mechanism was not simulated. For example,
in the ‘kangaroo : koala :: zebra : ?’ analogy, the output would be
(-.16, .15, -.07), where the closest animal in conceptual space to this
ideal solution might be ‘goat’.

Training NetAB: NetB was combined with NetA to create NetAB, which was
trained to map the base relation to the target domain. NetA was not
trained on analogy problems, but on the simplest form of mapping, i.e.,
the identity relation. Thus NetAB was trained with two animal inputs to
NetB, a relation label output for NetB, an animal input to NetA, and a
hypothetical animal output for NetA. For example, NetAB would have been
trained on (kangaroo, koala | greater-than, less-than, less-than) as
inputs and outputs to NetB, and (kangaroo | koala) as input and output to
NetA. NetAB was trained and tested using the same selection of training
pairs used for training NetB.

Testing NetAB: After testing for the ability to map identity relations,
NetAB was tested for the ability to map any relations from the base to
the target. Firstly, NetAB was tested with (i) the trained identity
mappings (N=226) and (ii) the untrained identity mappings (N=98).
Secondly, NetAB was tested for an ability to generalise the mapping task
to analogy problems, so that it was tested with (iii) analogies
involving relations which NetB had been trained with (N=30, randomly
selected), (iv) analogies involving relations which NetB had not been
trained with (N=30, randomly selected), and (v) analogies involving
relations which NetB had not been trained with, but which humans had
been given (N=30), in order to compare NetAB with human solutions.
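
One identity-mapping training pattern of the kind described above can be
assembled as follows. This is a sketch of the data layout only, not of
the network: the function names, the dictionary keys, and the rule for
deciding ‘equals’ (an exact match unless a hypothetical tolerance tol is
supplied) are assumptions, since the text does not say how the labels
were derived from the coordinates.

```python
# Two-unit label codes as given in the text.
LABELS = {"greater-than": (+1, -1), "equals": (-1, +1), "less-than": (+1, +1)}

def relation_label(x, y, tol=0.0):
    """Label x relative to y on one dimension; tol is a hypothetical tolerance."""
    if abs(x - y) <= tol:
        return "equals"
    return "greater-than" if x > y else "less-than"

def identity_example(a, b):
    """Build one pattern for the identity mapping (a | b) -> (a | b)."""
    labels = [relation_label(x, y) for x, y in zip(a, b)]
    return {
        "netb_input": a + b,                # two animal vectors, six values
        "netb_target": tuple(v for name in labels for v in LABELS[name]),
        "neta_input": a,                    # animal presented to NetA
        "neta_target": b,                   # animal NetA should reproduce
        "labels": labels,
    }

# Coordinates quoted in the text for subject 1.
kangaroo = (0.71, -0.43, -0.71)
koala = (0.51, 1.0, 0.28)
example = identity_example(kangaroo, koala)

# The identity test sets (226 trained, 98 untrained) cover all ordered pairs.
n_identity_pairs = 226 + 98
print(example["labels"], n_identity_pairs)
```

For the quoted coordinates this yields the label triple greater-than,
less-than, less-than, matching the training example in the text.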

Results 

NetAB produced solutions which were points in three dimensional space,
representing a hypothetical animal in a subject’s conceptual space. The
results for the five different tests of NetAB (where the number incorrect
was the number of solutions lying more than 0.5 away from the expected
solution) are shown in Table 2.

Test Set                   av. inc.
Trained (B) Identity       25
[label illegible]          34
[label illegible]          23
Untrained (B) Analogy      23
Human Analogy              30

Table 2. The average incorrect classifications (%) of identity and
analogical mappings made by NetAB for relations trained and untrained by
NetB.

For the analogies which were presented to both humans and the net,
three-way comparisons were made between NetAB solutions (AB), Rumelhart
and Abrahamson’s solutions (RA), and human solutions (H). The results of
these comparisons are summarized in Table 3.

Using as the criterion for correctness that solutions lay within 0.5 of
one another, Table 3 shows (i) how many of the solutions from each system
were incorrect when compared with the solutions from the other systems,
(ii) if the solutions were allowed to converge to the nearest existing
animals in the space, how many were correct with respect to one another,
and (iii) if the solutions were allowed to converge, how many were
identical across the three systems. The results presented are the means
of five simulations for each subject, averaged over ten subjects,
rounded to integers, and converted to percentages.

ANALYSIS

Table 2 shows that NetAB is able to utilise the bindings from NetB and
apply them so as to learn the identity mapping, both for relations it has
seen before and relations it has not seen before, with about a 70%
success rate. Once it has learned to map (a | b) in the base onto (a | b)
in the target, NetAB can then do any (including analogical) mapping, and
gives good results for analogies based on relations it has not been
trained with, as well as on trained relations.

% incorrect
    RA/AB              RA/H               AB/H
    av.    s.d.        av.    s.d.        av.    s.d.
    30     10          87     7           83     7

    CRA/CAB            CRA/H              CAB/H
    av.    s.d.        av.    s.d.        av.    s.d.
    17     [illegible] 77     10          70     17

% identical
    av.    s.d.        av.    s.d.        av.    s.d.
    13     5           4      1           3      1

Table 3. The average incorrect (%) and identical solutions (%) when the
three models were compared (RA = the parallelogram model, AB = NetAB,
H = Human, CRA = closest to RA solution, CAB = closest to NetAB
solution).
The net’s performance is good when the parallelogram solution is used as the criterion for correctness (which is not surprising since the parallelogram formula was used as the criterion for training NetB and NetAB). However, when the parallelogram model and the NetAB model are compared with human data, neither gives good results. Table 3 shows that while NetAB and Rumelhart and Abrahamson solutions are similar, neither converges to solutions which are identical to human solutions very often.

Further  investigation  of  this  result  was  done 
by  looking  at  the  correlations  of  the  base  vec¬ 
tors  (the  vector  difference  between  the  target 
element,  c  and  the  solution  for  the  given  mod¬ 
el)  between  each  of  the  models,  along  each  di¬ 
mension  (size,  ferocity,  and  domesticity).  The 
correlations  would  indicate  whether  the  solu¬ 
tions  for  each  model  were  alike,  or  very  differ¬ 
ent.  The  correlations,  averaged  across  all  sim¬ 
ulations,  are  summarized  in  Table  4. 
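The per-dimension comparison described above can be sketched as follows. This is a minimal illustration only: the function name, the sign convention for the base vector (solution minus target element c), and the synthetic data are assumptions, not the data or code used in the study.

```python
import numpy as np


def base_vector_correlations(targets, solutions_a, solutions_b):
    """Correlate two models' base vectors separately along each dimension.

    Each argument has shape (n_problems, n_dims). A base vector is taken
    here to be solution minus target element c (an assumed convention; the
    correlation is unchanged as long as both models use the same one).
    """
    targets = np.asarray(targets, dtype=float)
    base_a = np.asarray(solutions_a, dtype=float) - targets
    base_b = np.asarray(solutions_b, dtype=float) - targets
    return [
        float(np.corrcoef(base_a[:, k], base_b[:, k])[0, 1])
        for k in range(targets.shape[1])
    ]


# Synthetic stand-in data: 30 problems in a (size, ferocity, domesticity) space.
rng = np.random.default_rng(0)
targets = rng.normal(size=(30, 3))
shift = rng.normal(size=(30, 3))
model_a = targets + shift
model_b = targets + shift + 0.1 * rng.normal(size=(30, 3))

for name, r in zip(("size", "ferocity", "domesticity"),
                   base_vector_correlations(targets, model_a, model_b)):
    print(f"{name}: r = {r:.2f}")
```

Two models that propose similar displacements from c along a dimension yield a correlation near 1 on that dimension, which is the sense in which the correlations indicate whether the solutions are alike.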

The table shows that the correlations between the base vectors for the parallelogram model and for NetAB were good, middling for human data versus NetAB, and lower for human data versus parallelogram solutions. In addition, the correlations were better between human data and both models for the size dimension.

The  significance  of  the  size  correlations 
suggests a reason for the limitations of the parallelogram model (and subsequently for the
NetAB  model)  of  analogical  reasoning.  It  could 
be  that  in  human  judgements,  size  is  the  most 
salient  dimension  (as  illustrated  by  the  multi¬ 
dimensional  scaling  results),  and  that  size  is 
often  used  to  make  judgements  about  the  base 
relation  in  analogical  reasoning  regarding  ani¬ 
mals.  When  this  occurs,  all  models  will  give 
similar  solutions  along  the  size  dimension. 


      RA/AB              RA/H               AB/H
   s    f    d        s    f    d        s    f    d
  .9   .9   .8       .5   .3   .2       .6   .5   .4

Table 4. Correlations between the base vectors for the parallelogram model (RA), NetAB (AB), and the human data (H), averaged across all simulations for the three relational dimensions: size (s), ferocity (f), and domesticity (d).


However,  if  other  dimensions,  not  present  in 
the  restricted  three  dimensional  representations 
are  used,  then  humans  will  give  solutions  far 
away from those predicted by either model. If
this  is  the  case,  then  both  the  Rumelhart  and 
Abrahamson  model  and  NetAB  need  to  be  ad¬ 
justed  so  as  to  be  able  to  represent  relations 
along  all  conceptual  dimensions  and  to  be  able 
to  educe  the  salient  relation  from  amongst  all 
possibilities  in  order  to  apply  it  to  the  target. 

DISCUSSION 

NetAB  has  been  used  to  illustrate  how  dis¬ 
covery  by  analogy  can  be  viewed  as  comprising 
two  component  processes:  Eduction  of  Relations 
and  Eduction  of  Correlates.  NetAB  represents 
relations  such  that  they  can  be  learned  and  gen¬ 
eralised.  The  model  can  solve  analogy  problems 
for  both  seen  and  unseen  relational  instances. 

NetAB represents bindings in an arbitrary yet information rich concept space. Bindings are created on the run during analogy, then applied to a new domain to discover a solution to the problem. Bindings allow perceptual-like representations of pairs of animal concepts to be classified into symbolic-like categories (e.g. ‘greater-than’) without losing the ability to generalise to unseen instances.

While the accuracy of NetAB with respect to human solutions is limited at this stage, the processes embodied in NetAB illustrate how discovery by analogy may be modelled using feedforward nets.
ABSTRACT

This  paper  claims  that  higher  cognition 
implemented  by  a  connectionist  system  will 
be  essentially  analogical,  with  analogical  map¬ 
ping  by  continuous  systematic  substitution  as 
the  core  cognitive  process.  The  centrality  of 
analogy  is  argued  to  be  necessary  in  order  for 
a  connectionist  system  to  use  representations 
that  are  effectively  symbolic.  In  turn,  these 
representations  are  argued  to  be  a  necessary 
consequence  of  a  sequence  of  broad  design  de¬ 
cisions  needed  to  address  technical  problems 
in  adapting  a  connectionist  system  for  higher 
cognition.  The  design  decisions  are  driven  by 
the  demands  of  a  paradigmatic  cognitive  task 
and  the  desire  to  remain  faithful  to  the  con¬ 
straints  of  connectionist  components.  Thus, 
the  argument  explains  the  origin  of  symbolic 
representations  and  analogy  as  necessary  con¬ 
sequences  of  task  demands  and  connectionist 
processing  capabilities. 

INTRODUCTION 

One  of  the  more  persistent  problems  in  cog¬ 
nitive  science  is  the  reconciliation  of  the  emer¬ 
gent  functional  properties  of  human  cognition 
with  the  apparently  much  more  limited  func¬ 
tional  capabilities  of  the  neural  systems  that  im¬ 
plement  them.  Higher  cognition  has  been  most 
successfully  modelled  in  terms  of  symbolic 
computations  that  appear  implausibly  difficult 
to  implement  neurally.  On  the  other  hand,  con¬ 
nectionist  systems  (the  currently  favoured  par¬ 
adigm  for  modelling  the  presumed  computa¬ 
tional  processes  of  neural  systems)  appear  to 
be  neurally  implementable  but  far  less  capable 


than  symbolic  systems  of  implementing  the  de¬ 
sired  cognitive  functions. 

Despite the relative success of symbolic computation, the shortcomings of classical Artificial Intelligence (based on symbolic computation) suggest that simply scaling up the size
and  speed  of  current  symbolic  systems  will  not 
yield  the  desired  cognitive  functions.  One  re¬ 
sponse  to  this  situation  is  to  focus  on  building 
connectionist  systems.  This  action  is  based  on 
taking  human  cognition  as  an  existence  proof 
for  the  possibility  of  implementing  higher  cog¬ 
nition  with  a  connectionist  system. 

Some  researchers  have  implemented  clas¬ 
sic  symbolic  architectures  in  connectionist  sys¬ 
tems.  For  example,  Touretzky  and  Hinton 
(1988)  built  a  Distributed  Connectionist  Pro¬ 
duction  System.  We  have  chosen  not  to  fol¬ 
low  this  approach  of  implementing  known 
symbolic  processes  because  we  believe  it  will 
be  bound  by  the  limitations  of  current  sym¬ 
bolic  models. 

The  problem  of  attempting  to  find  a  con¬ 
nectionist  architecture  with  the  desired  cog¬ 
nitive  properties  can  be  cast  as  one  of  efficient¬ 
ly  searching  design  space.  Given  the  vast  num¬ 
ber  of  potential  connectionist  systems  we  need 
a  strategy  to  guide  our  exploration  of  designs. 
We  have  chosen  to  be  guided  by  the  constraints 
imposed  by  connectionist  computational  ele¬ 
ments  and  the  problems  to  be  solved  by  high¬ 
er  cognition.  By  remaining  true  to  the  connec¬ 
tionist  raw  material  we  hope  to  allow  solutions 
that  are  obscured  by  taking  symbolic  opera¬ 
tions  as  the  primitive  functions  of  processing. 
If it turns out that the emergent properties of such a connectionist system may be characterised as symbolic, then that is further evidence for the plausibility of the system (given the success of symbolic models - but not symbolic implementations).

In this paper our strategy for exploration of design space may be summarised by the question: Starting with a simple connectionist system, what minimal design decisions might lead to a capability for higher cognition?

This question arises from an evolutionary stance. It is taken as given that higher cognition is the function of an artefact built from neural components and designed through evolution to solve certain survival problems. Given the conservative nature of evolutionary design and that connectionism is an appropriate model of neural computation, we believe that a sequence of minimal modifications starting from the simplest connectionist system will be sufficient to yield a system with the desired cognitive capabilities.

This  paper  suggests  a  series  of  design 
problems  and  broad  design  approaches  to 
their solution. (We aspire to precise, implementable design choices, but that is work in
progress.)  Since  the  modifications  to  the  con¬ 
nectionist  architecture  are  constrained  to  be 
minimal,  the  hope  is  that  the  resultant  archi¬ 
tecture  will  be  practically  implementable. 
Furthermore,  we  argue  that  the  symbolic 
properties  of  higher  cognition  and  the  cen¬ 
trality  of  analogy  to  cognition  arise  as  neces¬ 
sary  consequences  of  the  design  decisions 
motivated  by  connectionist  problems.  Thus, 
the  argument  (to  the  extent  it  is  successful) 
explains  the  emergence  of  symbolic  proper¬ 
ties  and  the  centrality  of  analogy. 

THE  DESIGN  PROBLEM 

In  order  to  specify  the  design  problem  that 
is  the  basis  of  this  argument  it  is  necessary  to 
state  the  functional  capabilities  that  are  required 
of  the  final  system  and  the  design  of  the  initial 
connectionist  system. 

REQUIRED  FUNCTIONAL  CAPABILITIES 
Specifying higher cognition in its entirety is obviously too ambitious a sub-goal for this paper. The critical cognitive characteristics sought are embodied in the ability to follow novel instructions. To make this concrete we will settle on an arbitrary but paradigmatic task¹ to be carried out by the cognitive system. The task is to be able to follow novel instructions, such as:

I will show you an artificially coloured picture of an animal and play the sound of an animal. If the sound belongs to the pictured animal you must name the colour of the pictured animal, otherwise name the animal that made the sound.

For the purposes of this paper the issues of language understanding required for comprehension of the instruction are ignored and we focus on complying with the instruction once it has been comprehended.

INITIAL  CONNECTIONIST  SYSTEM 

The system is to be implemented with typical connectionist units. That is, each unit may have multiple inputs and a single output. All inputs and outputs are to be graded, scalar quantities. The output is a nonlinear monotonic function of the weighted sum of the inputs or products of groups of the inputs.
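A unit of the kind just described can be sketched in a few lines. The sketch is illustrative only: the function name, the use of tanh as the nonlinear monotonic function, and the way input groups are specified are all assumptions, since the paper does not commit to any particular choice.

```python
import math


def unit_output(inputs, weights, groups=None):
    """Graded scalar output of a single connectionist unit.

    inputs  -- list of graded scalar input values
    weights -- one weight per term
    groups  -- optional list of tuples of input indices; the inputs in each
               group are multiplied together to form one term. If omitted,
               every input is its own term (a plain weighted sum).
    """
    if groups is None:
        groups = [(i,) for i in range(len(inputs))]
    if len(weights) != len(groups):
        raise ValueError("need exactly one weight per term")
    total = 0.0
    for weight, group in zip(weights, groups):
        term = 1.0
        for index in group:
            term *= inputs[index]
        total += weight * term
    # tanh stands in for any nonlinear monotonic function (assumed choice).
    return math.tanh(total)


# Weighted sum of two inputs.
print(unit_output([1.0, 0.5], [0.2, 0.4]))
# One term formed from the product of both inputs.
print(unit_output([2.0, 3.0], [0.1], groups=[(0, 1)]))
```

The product terms matter later in the argument, because multiplicative interaction between graded quantities is what the binding operators rely on.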

The  system  is  to  have  a  fixed  architecture. 
That  is,  the  pattern  of  interconnection  of  units 
is  not  to  vary  or  be  constructed  as  a  function  of 
the  current  task.  For  example,  this  rules  out  the 
ACME  model  of  analogical  mapping  (Holyoak 
&  Thagard,  1989)  as  a  permissible  architecture 
because  the  neural  net  is  constructed  specifi¬ 
cally  for  each  problem. 

The final constraint on the connectionist architecture arises from the nature of the task. The system is to implement the “top” level of cognition. Therefore, it must be capable of integrating multiple sensory modalities. We assume that the full system will have other levels where the sensory modalities are processed separately. The boundary of the system of interest is to be expanded until it reaches the point at which the modalities are separately represented.

¹ This paper was inspired by Hadley (1998). He used an example of following novel instructions to motivate his argument that most human mental skills must reside in separate connectionist modules and thereby instantiate a “classical architecture”.

REPRESENTATIONAL  DESIGN 
DECISIONS 

REPRESENTING  NOVEL  CONCEPTS 

The  first  design  problem  to  address  is  rep¬ 
resentational  flexibility.  The  paradigmatic 
task  requires  the  creation  of  new  concepts. 
At  the  very  least  it  requires  a  concept  of  the 
instruction  to  be  followed.  Therefore,  the 
system must be capable of representing arbitrary new concepts².

This  constraint  rules  out  local  representa¬ 
tions  where  each  unit  has  a  fixed  meaning.  If 
all  the  units  are  pre-allocated  to  concepts  how 
can  new  concepts  be  represented?  It  also  seems 
implausibly  wasteful  to  have  unallocated  units 
waiting  to  be  allocated  to  what  may  be  ephem¬ 
eral  concepts. 

This  constraint  can  be  avoided  by  using 
distributed  representations  (that  is,  the  rep¬ 
resentation  consists  of  the  pattern  of  activi¬ 
ties  across  the  units).  However,  not  all  dis¬ 
tributed  representations  avoid  the  problem. 
The  same  argument  would  apply  to  distrib¬ 
uted  representations  where  the  individual 
units  have  fixed  meanings  as  features.  Any 
fixed  allocation  of  meanings  to  units  will  lim¬ 
it  the  representational  possibilities. 

A  related  argument  comes  from  the  re¬ 
quirement  that  the  system  should  integrate  in¬ 
formation  from  multiple  modalities.  Below 
the  point  of  integration  the  information  from 
separate  modalities  travels  on  separate  path¬ 
ways.  Above  the  point  of  integration  it  would 
be  possible  to  have  disjoint  segments  of  the 
representation  devoted  to  different  modali¬ 
ties,  but  this  would  be  wasteful  of  represen¬ 
tational  resources  (units). 


² Any use of “concept” and “representation” begs many philosophical questions. For current purposes read “concept” as “a mental state” and “representation” as “a physical state standing for a mental state”.


It  is  possible  to  have  information  from  sep¬ 
arate  modalities  represented  over  the  same  units 
at  different  times  provided  that  context  infor¬ 
mation  is  available  as  part  of  the  representa¬ 
tion.  This  will  allow  the  representation  to  be 
interpreted  differently  depending  on  the  source. 

This  style  of  representation  results  in  the 
units  having  context  dependent  meanings.  The 
activities  of  individual  units  become  meaning¬ 
less  unless  they  are  able  to  be  interpreted  in 
the  context  of  the  activities  in  the  other  units 
in the representation³.

Therefore,  the  first  design  decision  is  to 
use a distributed representation where the individual units do not have fixed meanings⁴. Kanerva (1995) has also argued that fixed feature representations are impractical for open-ended domains.

IMMEDIATE  LEARNING 
The next design problem arises from the need for immediate learning. The system must be able to learn⁵ the novel concepts immediately from a single exposure. This rules out iterative weight adjustment techniques such as backpropagation because they are too slow, typically requiring thousands of exposures⁶.

What we want is that some specific output pattern (representing the novel concept) should be produced in response to a specific combination of input patterns. This is equivalent to saying that we want to associate the input and output patterns and to retrieve the output pattern given the inputs as a cue.

³ Context dependency does not have to be all or none. At one end of the scale we can put representations where the unit activities can be interpreted in isolation. At the other end of the scale we have representations in which only the entire pattern of activations has significance. Between these extremes are representations where subsets of the activation pattern may be assigned meanings.

⁴ The degree of context dependency involves trade-offs. At the context-independent end we restrict the representational capacity and flexibility. At the total pattern end any corruption of the pattern would completely change the meaning. We suspect that a good trade-off might exist not too far from the context-independent end of the scale where each unit is interpretable in the context of a small number of other units (relative to the total number of units) and participates in multiple, overlapping, meaningful subpatterns.

This can be achieved with binding operators for associating patterns and unbinding operators for retrieving components from bound patterns. Generically, if b=bind(x,y) then unbind(b,x)=y where b, x, and y are patterns. A variety of binding operators have been developed (Gayler, 1998; Kanerva, 1996; Plate, 1994; Smolensky, 1990)⁷. All the binding and unbinding operators are able to be implemented as connectionist primitives able to operate in a single time step.
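The generic relation b=bind(x,y), unbind(b,x)=y can be made concrete with one simple choice of operator. The sketch below assumes random patterns with elements of +1 or -1 and uses element-wise multiplication for both binding and unbinding; this is only one of the possible operator families, and the dimensionality and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 10_000  # assumed pattern dimensionality


def random_pattern():
    """A random distributed pattern with elements +1 or -1."""
    return rng.choice([-1, 1], size=DIM)


def bind(x, y):
    """Associate two patterns by element-wise multiplication."""
    return x * y


def unbind(b, x):
    """Retrieve the partner of x from binding b.

    With +1/-1 elements every element is its own multiplicative inverse,
    so unbinding is the same element-wise multiplication as binding.
    """
    return b * x


x, y = random_pattern(), random_pattern()
b = bind(x, y)                            # created in one step, no iteration
print(np.array_equal(unbind(b, x), y))    # presenting x as a cue recovers y
```

Nothing is adjusted iteratively: the association exists as soon as the product has been computed, which is the sense of immediate learning at issue here.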

In the example above the patterns x and y may be taken as input and output patterns respectively. The pattern b (which is able to be created in a single time step) represents the association of x and y. If b is present⁸ in an environment where unbinding occurs automatically, the presentation of the input pattern x will result in the creation of the output pattern y.

The  corresponding  design  decision  is  that 
the  connectionist  system  should  implement 
immediate  learning  as  pattern  association  via 
bind  and  unbind  operators. 

COMPATIBILITY  OF  REPRESENTATIONS 

The next problem arises because the paradigmatic task requires close interaction of short term and long term knowledge. The task calls on pre-existing skills such as animal identification and requires their integration with the short term concepts of the task and the work in progress. Therefore, there is a requirement that the system is able to integrate short term and long term knowledge.

⁵ By learning we mean changing the state of the system so that future occurrences of the novel concept are recognised. This changed state must be able to persist longer than the immediate span of attention.

⁶ This is not to say that iterative weight adjustment procedures have no place in connectionist systems, only that they are inadequate for short term cognitive learning.

⁷ Terminology differs between systems and there is some scope for confusion as each system has at least two distinct operators that might be called binding. We use the term "binding" for what might best be called structural binding. In this case each of the components is structurally required (also called role/filler binding or attribute/value binding). We use the term "bundling" for what might be called decorative binding. In this case each of the components is optional (for example, the slots of a frame). This is called superposition or chunking. (The latter term is used differently by different authors.)

In  a  traditional  connectionist  system  short 
term  knowledge  is  usually  implemented  as  ac¬ 
tivations  of  units  and  long  term  knowledge  as 
connection  weights.  This  implementation  cap¬ 
tures  the  relative  persistence  of  the  two  types 
of knowledge. However, as abstract representations, the activation vector and weight matrix are incommensurable and can only indirectly influence each other via the processing
function  of  the  system.  Our  intuition  is  that  the 
use  of  such  different  representations  for  short 
and  long  term  knowledge  will  make  integra¬ 
tion  difficult. 

Therefore,  the  next  design  decision  is  to 
require  short  and  long  term  knowledge  to  be 
represented  (though  not  necessarily  imple¬ 
mented)  in  the  same  way.  That  is,  bindings  in 
short  and  long  term  memory  must  have  iden¬ 
tical  representations  and  have  identical  effects 
on  operations. 

These  properties  automatically  come  from 
binding  methods  that  are  based  on  element-wise 
multiplication  operations  of  terms.  In  connec¬ 
tionist  systems  activations  and  weights  inter¬ 
act  multiplicatively.  Therefore,  in  a  binding 
operation  short  term  knowledge  (activations) 
and  long  term  knowledge  (weights)  may  be  used 
interchangeably  provided  that  they  are  all  rep¬ 
resented  as  vectors  of  the  same  dimension. 

⁸ Binding addresses the issue of creating the association. Other connectionist mechanisms, able to operate in a single time step, exist to make the representation of the association persistent.

A newly created binding may be kept as an activation pattern or added into a weight vector with equivalent effect. Thus short and long term knowledge are identical in terms of their ability to be interrogated by current processing. The only asymmetry between the two storage forms is that whereas activation patterns may interact with other activation patterns and weight patterns, weight patterns may not interact with other weight patterns except via the mediation of an interaction with an activation pattern.
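The claimed equivalence of the two storage forms can be illustrated with a small sketch. It assumes +1/-1 patterns, element-wise multiplication as the binding operator, and a long term store modelled as a single weight vector into which bindings are added; the names, the dimensionality, and the number of earlier bindings are illustrative assumptions rather than details given in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 10_000  # assumed pattern dimensionality


def pattern():
    """A random pattern with elements +1 or -1."""
    return rng.choice([-1.0, 1.0], size=DIM)


def cosine(a, b):
    """Graded similarity between two pattern vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


x, y = pattern(), pattern()
new_binding = x * y  # short term form: held as an activation pattern

# Long term form: the same vector added into a weight vector that already
# holds five unrelated bindings (an assumed amount of prior knowledge).
long_term = sum(pattern() * pattern() for _ in range(5))
long_term = long_term + new_binding

# The same cue x interrogates either store in the same way.
from_activation = new_binding * x
from_weights = long_term * x

print(round(cosine(from_activation, y), 2))
print(round(cosine(from_weights, y), 2))
print(round(cosine(from_weights, pattern()), 2))
```

Under these assumptions the cue recovers y exactly from the activation pattern, and from the weight vector it recovers a noisier pattern that is still far more similar to y than to an unrelated pattern. The noise comes from the other stored bindings, which is one way of reading the soft capacity constraints mentioned later.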

DISCUSSION  OF  REPRESENTATIONS 

BINDINGS  AS  VIRTUAL  NEURONS 

Historically,  single  cell  recording  allowed 
neurophysiologists  to  identify  stimuli  that 
caused  a  neuron  to  fire.  Over  time  researchers 
discovered  cells  responsive  to  progressively 
more  complex  stimuli.  This  was  paralleled  in 
the  psychophysical  literature  by  hypotheses  of 
progressively  more  complex  feature  detectors. 
This  led  to  the  infamous  “grandmother  cell”  as 
a  reductio  ad  absurdum  argument  against  com¬ 
plex  feature  detectors. 

It  seems  implausible  and  wasteful  that  the 
brain  might  come  prestocked  with  feature  de¬ 
tectors  for  every  concept  that  a  person  might 
possibly  encounter  (let  alone  what  our  descen¬ 
dants  might  encounter).  This  problem  might  be 
avoided  if  feature  detectors  could  be  created 
instantly  on  demand.  This  would  be  very  diffi¬ 
cult  to  achieve  with  real  neurons  but  would  be 
feasible  if  the  feature  detectors  were  virtual 
neurons  implemented  on  a  fixed  neural  base. 

The  design  decisions  so  far  are:  that  con¬ 
cepts  will  be  represented  by  distributed  patterns 
with  individual  units  having  no  specific  mean¬ 
ing;  that  association  of  concepts  will  be  carried 
out  by  binding  operators  capable  of  immediate 
learning;  and  that  short  and  long  term  repre¬ 
sentations  of  bindings  will  be  equivalent,  dif¬ 
fering  only  in  their  form  of  storage  and  ability 
to  interact  with  other  bindings.  These  bindings 
may  be  thought  of  as  virtual  neurons.  They  are 
like neurons because for each binding an output pattern⁹ may be retrieved by presentation
of  the  associated  input  pattern.  They  are  virtu¬ 
al  because  any  number  of  neurons  (bindings) 
may  be  implemented  in  a  fixed  number  of  real 
neurons  or  connectionist  units  (subject  to  soft 
capacity  constraints). 

The advantages of virtual neurons implemented as bindings are: that these neurons can be created on demand to represent novel concepts; that they can be created in a single exposure; that they may be ephemeral or permanent depending on whether they exist only as activation vectors or are stored in long term memory as weight changes.

BINDINGS  AND  SYSTEMATICITY 

Fodor  and  Pylyshyn  (1988)  argued  that  one 
of  the  hallmarks  of  cognition  that  is  explicable 
by  a  symbolic  approach  but  not  by  connection¬ 
ist  models  is  systematicity.  This  is  the  property 
that  having  the  capability  to  represent  some 
concepts  necessarily  entails  the  capability  to 
represent  other  related  concepts.  For  example, 
they  argued  that  a  cognitive  system  able  to  think 
Mary  loves  John  must  necessarily  be  able  to 
think  John  loves  Mary. 

Bindings  automatically  provide  systema¬ 
ticity.  If  two  patterns  are  bound  together  either 
may  be  used  as  a  cue  for  the  retrieval  of  the 
other.  The  notions  of  input  and  output,  which 
are  meaningful  for  real  neurons,  are  not  rele¬ 
vant  to  bindings  as  virtual  neurons.  In  effect, 
whenever  a  binding  implements  a  virtual  neu¬ 
ron  mapping  x  to  y  it  also  necessarily  imple¬ 
ments  a  virtual  neuron  mapping  y  to  x. 
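The symmetry between cue and retrieved pattern can be checked directly. As before, this is a sketch under the assumption of +1/-1 patterns bound by element-wise multiplication; the helper name `retrieve` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
DIM = 10_000  # assumed pattern dimensionality

x = rng.choice([-1, 1], size=DIM)
y = rng.choice([-1, 1], size=DIM)
binding = x * y  # a single binding, with no designated input or output


def retrieve(binding, cue):
    """Use either bound pattern as the cue for the other."""
    return binding * cue


print(np.array_equal(retrieve(binding, x), y))  # virtual neuron mapping x to y
print(np.array_equal(retrieve(binding, y), x))  # the same binding maps y to x
```

The second mapping is not stored separately; it follows from the first because the operator treats both components alike.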

It  could  be  objected  that  producing  all  map¬ 
pings  between  the  bound  patterns  is  not  neces¬ 
sarily  desirable.  We  agree,  but  discount  this 
objection  for  two  reasons.  Firstly,  we  are  con¬ 
cerned  with  higher  cognitive  functions.  At  this 
level we expect the major demand to be maximal exploitation of the available knowledge.
Systematicity  is  a  mechanism  for  generating 
hypotheses  from  prior  knowledge  to  as  yet  un¬ 
encountered  situations.  Halford  (1996)  has  also 
argued  for  this  capability  which  he  labels 
“omni-directional  access”. 

⁹ Note that now the output is a pattern rather than a scalar value. This is a consequence of distributed representation rather than virtualisation. In a distributed representation the vector of scalar outputs of a group of units is the more convenient level of analysis of output.

The second reason is that we have assumed that where it is important to limit the hypotheses generated by systematicity this will be achieved by the details of the representational scheme. For example, Smolensky (1990) proposed a representational scheme in which fillers were bound to roles rather than directly to each other. For example, John loves Mary might be represented as bundle(bind(lover,John), bind(loved,Mary)) where lover and loved are roles and John and Mary are fillers, rather than bind(loves,John,Mary) where loves, John and Mary are all fillers. In Smolensky’s representation the systematicity would be expressed with respect to role/filler pairs rather than directly between fillers.
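The role/filler form bundle(bind(lover,John), bind(loved,Mary)) can be sketched as follows. The sketch does not use Smolensky's own tensor product operator; for brevity it substitutes element-wise multiplication of +1/-1 patterns for binding and vector addition for bundling, and the helper `filler_of` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM = 10_000  # assumed pattern dimensionality
vec = {name: rng.choice([-1, 1], size=DIM)
       for name in ("lover", "loved", "John", "Mary")}


def bind(x, y):
    """Bind a role to a filler (element-wise multiplication, assumed)."""
    return x * y


def bundle(*patterns):
    """Superpose several bindings into one pattern (vector addition)."""
    return np.sum(patterns, axis=0)


john_loves_mary = bundle(bind(vec["lover"], vec["John"]),
                         bind(vec["loved"], vec["Mary"]))
mary_loves_john = bundle(bind(vec["lover"], vec["Mary"]),
                         bind(vec["loved"], vec["John"]))


def filler_of(role, proposition):
    """Unbind a role and return the name of the most similar filler."""
    noisy = proposition * vec[role]
    return max(("John", "Mary"), key=lambda name: noisy @ vec[name])


print(filler_of("lover", john_loves_mary))
print(filler_of("loved", john_loves_mary))
print(filler_of("lover", mary_loves_john))
```

The two propositions are built from the same four patterns, yet they are distinct vectors, and querying a role returns the filler bound to that role. Systematicity is therefore expressed over role/filler pairs, as described above.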

BINDINGS  AS  RULES  ON  CONSTANTS 

We have argued that bindings may be viewed as implementing virtual neurons. They may also be viewed as implementing rules or productions restricted to literal constants. A binding of x with y may be construed as implementing the rules IF x THEN y and IF y THEN x. Similarly for higher order bindings of x, y, and z: IF bind(x,y) THEN z, IF bind(x,z) THEN y, IF x THEN bind(y,z), and so on. The restriction on the rules is that the antecedent and consequent consist only of literal constants because the bindings are between constant pattern vectors.

However,  bindings  do  implement  an  exten¬ 
sion  relative  to  traditional  symbolic  rules.  Be¬ 
cause  bindings  are  represented  as  pattern  vec¬ 
tors  they  acquire  some  properties  from  vector 
arithmetic. The pattern vectors may be multiplicatively scaled and added. Thus it makes
sense  to  talk  about  a  binding  operating  on  a 
mixture  of  patterns.  In  most  binding  systems 
bind(x,y+z)  =  bind(x,y)  +  bind(x,z).  It  is  also 
possible  to  have  graded  similarity  between  vec¬ 
tors.  This  allows  rules  implemented  as  bind¬ 
ings  to  operate  in  a  graded  fashion. 
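Both properties, distribution of binding over mixtures and graded operation of a rule, can be checked numerically. The sketch assumes +1/-1 patterns and element-wise multiplication as the binding operator; the mixing weights and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
DIM = 10_000  # assumed pattern dimensionality
x, y, z = (rng.choice([-1.0, 1.0], size=DIM) for _ in range(3))


def bind(a, b):
    """Element-wise multiplicative binding (assumed operator)."""
    return a * b


def cosine(a, b):
    """Graded similarity between two pattern vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Binding distributes over a mixture of patterns.
print(np.array_equal(bind(x, y + z), bind(x, y) + bind(x, z)))

# The rule IF x THEN y responds in proportion to how much the cue
# resembles x, rather than matching all or none.
rule = bind(x, y)
strengths = []
for mix in (1.0, 0.5, 0.0):
    cue = mix * x + (1.0 - mix) * z   # a cue that is only partly x
    strengths.append(cosine(rule * cue, y))
print([round(s, 2) for s in strengths])
```

As the cue moves from x towards the unrelated pattern z, the similarity of the response to y falls away smoothly. A traditional symbolic rule with literal constants would either fire or not.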

PROCESSING  DESIGN  DECISIONS 

GENERATE  DISSIMILAR  VECTORS 

The next design decision is required as a consequence of an earlier decision. Recall that we decided to use distributed pattern vectors to represent new concepts. Every time a new concept is created a new pattern will be required to represent it. What constraints might exist on the patterns that can be used for new concepts?

Thinking  of  the  binding  as  a  virtual  neuron 
there  will  be  one  or  more  input  patterns  to  be 
associated  with  the  output  pattern  representing 
the novel concept. The binding process imposes no constraint on the choice of output vector because any vectors may be bound¹⁰. The major constraint is that novel concepts require
novel  concept  vectors.  They  should  not  be  iden¬ 
tical  to  any  pre-existing  concept  vector  other¬ 
wise  the  combination  of  inputs  will  be  bound 
to  a  pre-existing  concept. 

In  standard  connectionist  models  much  use 
is  made  of  the  fact  that  there  is  a  graded  simi¬ 
larity  relation  between  vectors.  To  the  extent 
that  one  vector  is  similar  to  another  it  is  able  to 
stand  in  for  the  other  vector  in  further  processing.
If  a  new  concept  is  truly  novel  the  pattern
representing  it  must  not  be  similar  to  any  pre¬ 
existing  vectors  in  order  to  avoid  having  effects 
similar  to  some  pre-existing  concept. 

Given  immediate  learning,  pattern  vectors 
for  new  concepts  will  be  created  at  a  point  when 
the  system  is  in  a  state  of  ignorance  about  the 
potential  relationships  between  the  new  con¬ 
cept  and  any  pre-existing  concepts.  Thus,  even 
if  the  new  concept  vector  should  ideally  be  sim¬ 
ilar  to  some  pre-existing  concept  vector  it  must 
necessarily  be  created  dissimilar  to  all  pre-ex¬ 
isting  concept  vectors. 

The  corresponding  design  decision  is  that 
vectors  representing  new  concepts  should  be 
generated  to  have  zero  or  minimal  similarity 
to  all  pre-existing  concept  vectors.  It  will  sim¬ 
ply  be  assumed  that  such  a  generation  mecha¬ 
nism  is  feasible. 

Vector  Generation  Mechanisms 

We discuss some possible generation mechanisms with no particular commitment to any of them. The least interesting possibility for creating dissimilar vectors is to rely on random noise in the system. Random high dimensional vectors have close to zero similarity on average. An extra mechanism might be required for those rare occasions when the new vector happened to be similar to an old vector.

¹⁰ There may be some exceptions. In multiplicative binding (Gayler, 1998) a pattern may not be bound to itself. This is not a problem in the current case because it implies that the output vector is identical to one of the input vectors (in which case the output vector has already been assigned to another concept).
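The random-noise mechanism can be illustrated with a short sketch. It assumes bipolar vectors, cosine similarity as the similarity measure, and a simple reject-and-redraw step as the "extra mechanism"; the threshold and the function name `new_concept_vector` are hypothetical choices made for illustration only.

```python
import math
import random

def random_vector(dim, rng):
    """Random bipolar (+1/-1) pattern vector (an assumed representation)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]

def cosine(x, y):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def new_concept_vector(existing, dim, rng, threshold=0.2, max_tries=100):
    """Draw random vectors until one is dissimilar to every existing vector.

    The redraw loop stands in for the unspecified 'extra mechanism' that
    handles the rare case of an accidental similarity.
    """
    for _ in range(max_tries):
        candidate = random_vector(dim, rng)
        if all(abs(cosine(candidate, old)) < threshold for old in existing):
            return candidate
    raise RuntimeError("no sufficiently dissimilar vector found")

rng = random.Random(1)
concepts = []
for _ in range(20):
    concepts.append(new_concept_vector(concepts, dim=2000, rng=rng))

# Largest absolute similarity between any two distinct concept vectors.
max_similarity = max(
    abs(cosine(concepts[i], concepts[j]))
    for i in range(len(concepts))
    for j in range(i)
)
```

In high dimensions the redraw branch is almost never taken, which is consistent with the claim that random vectors are dissimilar on average.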

Another  mechanism  (possibly  an  implemen¬ 
tation  of  the  previous  one)  is  to  modify  a  system 
with  recurrent  dynamics,  similar  to  the  Brain- 
State-in-a-Box  (Anderson,  Silverstein,  Ritz,  & 
Jones,  1977),  such  that  pre-existing  vectors  be¬ 
come  repellors  in  the  state  space.  When  present¬ 
ed  with  a  novel  stimulus  the  system  would  settle 
into  a  state  different  from  any  previously  encoun¬ 
tered  state.  The  randomness  in  the  choice  of  the 
new  pattern  might  come  from  chaotic  dynamics 
or  the  amplification  of  innate  noise. 

The  most  interesting  possibility  is  that  the 
novel  concept  vector  might  be  created  as  a  side 
effect  of  binding.  For  example,  the  pattern  rep¬ 
resenting  the  binding  (or  some  deterministic 
function  of  the  binding  pattern)  of  the  inputs 
could  be  used  as  the  output  pattern.  Multipli¬ 
cative  binding  (Gayler,  1998)  is  essentially  a 
randomising  operation.  The  representation  of 
bind(x,y)  is  not  similar  to  either  x  or  y  when  x 
and  y  are  dissimilar  (as  we  argue  they  should 
be  if  they  represent  concepts  of  higher  cognition)¹¹.
Any  novel  combination  of  representations
to  be  bound  will  necessarily  generate  a
binding  that  is  novel.  This  avoids  the  system 
having  to  decide  when  to  create  a  new  repre¬ 
sentation.  If  the  combination  is  novel  the  bind¬ 
ing  and  the  output  pattern  will  also  be  novel. 
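As a rough illustration of binding as a generator of novel patterns, the sketch below uses elementwise multiplication of bipolar vectors as a stand-in for multiplicative binding. This is an assumed simplification, not the implementation of Gayler (1998). It shows that the bound pattern is dissimilar to both inputs and that the same input combination always yields the same output.

```python
import math
import random

def random_vector(dim, rng):
    """Random bipolar (+1/-1) pattern vector (an assumed representation)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]

def bind(x, y):
    """Multiplicative binding, sketched as an elementwise product."""
    return [a * b for a, b in zip(x, y)]

def cosine(x, y):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

rng = random.Random(2)
x = random_vector(2000, rng)
y = random_vector(2000, rng)

# Use the binding itself as the pattern for the new concept.
output = bind(x, y)

sim_to_x = cosine(output, x)  # expected to be near zero
sim_to_y = cosine(output, y)  # expected to be near zero

# The output is a deterministic function of the inputs, so the system
# never has to decide separately when a new representation is needed.
same_again = bind(x, y) == output
```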

Vectors  and  Classical  Symbols 

The decision to represent novel concepts with vectors dissimilar to all pre-existing vectors leads to discrete representations. In general, two vector representations will either be identical or dissimilar¹².

The  ability  to  bind  arbitrary  input  and  out¬ 
put  representations  means  that  the  output  rep¬ 
resentations  can  have  arbitrary  referents.  The 
actual  arbitrariness  of  the  linkages  will  be 
guaranteed  to  the  extent  that  representations  for 
new  concepts  are  generated  at  random. 


Classical  symbols  are  discrete  and  arbitrary, 
in  that  two  symbols  are  either  the  same  or  dif¬ 
ferent  and  that  the  form  of  a  symbol  has  no 
necessary  dependence  on  the  referent  of  the 
symbol.  Thus,  the  design  decisions  so  far  have 
led  to  connectionist  representations  that  behave 
like  classical  symbols. 

SYSTEMATIC  VECTOR  SUBSTITUTION 

The  design  decisions  taken  so  far  have  gen¬ 
erated  a  new  problem.  If  the  representations  of 
concepts  are  primarily  dissimilar  and  arbitrary 
how  can  they  be  used?  Traditional  connection¬ 
ist  processing  relies  on  the  similarity  of  vector 
representations,  which  has  been  removed  by  the 
design  decisions. 

If the individual, isolated, pattern vectors do not carry information, what does? We believe that the information must be carried in the structural interrelationships of the bindings. During any episode the system will be creating many bindings which will be interrelated by the individual pattern vectors they have in common¹³. On a subsequent occasion the new episode will be recognised as equivalent to the previous episode if the structural interrelations of the bindings are the same (even though the pattern vectors composing the bindings may differ). This structural equivalence is proved if a systematic substitution of pattern vectors in the current episode yields the previous trace¹⁴.
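The notion of structural equivalence under systematic substitution can be sketched symbolically, using the "loves" episodes given in the authors' footnotes. This is only a brute-force illustration over symbolic tuples, not a connectionist mechanism; the function names and the decision to hold the relation `loves` fixed are assumptions made for the example.

```python
from itertools import permutations

def substitute(episode, mapping):
    """Apply a substitution to every term of every proposition."""
    return frozenset(
        tuple(mapping.get(term, term) for term in prop) for prop in episode
    )

def find_substitution(current, previous, fixed=()):
    """Search for a one-to-one substitution turning `current` into `previous`.

    Terms listed in `fixed` (here the relation name) are never substituted.
    Returns a mapping, or None if the episodes are not structurally equivalent.
    """
    cur_terms = sorted({t for p in current for t in p} - set(fixed))
    prev_terms = sorted({t for p in previous for t in p} - set(fixed))
    if len(cur_terms) != len(prev_terms):
        return None
    target = frozenset(previous)
    for perm in permutations(prev_terms):
        mapping = dict(zip(cur_terms, perm))
        if substitute(current, mapping) == target:
            return mapping
    return None

previous = {("loves", "Chris", "Pat"), ("loves", "Pat", "Robin"),
            ("loves", "Robin", "Chris")}
current = {("loves", "Alex", "Jerry"), ("loves", "Jerry", "Lee"),
           ("loves", "Lee", "Alex")}

mapping = find_substitution(current, previous, fixed=("loves",))

# A non-cyclic episode has a different structure and admits no substitution.
broken = {("loves", "Alex", "Jerry"), ("loves", "Jerry", "Lee"),
          ("loves", "Alex", "Lee")}
no_mapping = find_substitution(broken, previous, fixed=("loves",))
```

The exhaustive search over permutations is exponential in the number of terms; it is used here only to make the definition concrete.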


¹¹ In all the binding systems mentioned earlier (Gayler, 1998; Kanerva, 1996; Plate, 1994; Smolensky, 1990) the bind() operator reduces the similarity of compounds compared to their components. Considering the limiting case, the similarity of bind(x,y) to x and y is zero when the similarity of x and y is zero. Thus the overall effect of binding is to make representations less similar, and the introduction of a single dissimilar pattern will render dissimilar every binding in which it participates. We are not denying the importance of those occasions when patterns and their bindings are similar; rather, we are focusing on the occasions of dissimilarity because we believe they are more relevant to higher cognition and have been ignored by connectionists.

¹² It is possible for two representations to be similar and for processing to depend on that similarity. However, in a high dimensional vector space the proportion of vectors similar to any given vector becomes very small. Most pairs of vectors chosen at random will be dissimilar.
The corresponding design decision is that the basic mode of processing of the connectionist system should be continuous systematic substitution of representations for the elaboration of the currently active representations from previous representations.

Systematic substitution is relatively simple with the binding systems mentioned earlier. For example, in multiplicative binding, binding any structure to bind(a,b) will result in replacing all occurrences of a in the structure with b (and b with a)¹⁵.
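The substitution effect of binding with bind(a,b) can be demonstrated with a small sketch. It assumes bipolar vectors with elementwise multiplication as binding, so that every vector is its own inverse; the role vectors `role1` and `role2` are hypothetical names introduced for the example. Under these assumptions the swap of a and b is exact for terms that contain a or b.

```python
import random

def random_vector(dim, rng):
    """Random bipolar (+1/-1) pattern vector (an assumed representation)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]

def bind(x, y):
    """Multiplicative binding, sketched as an elementwise product."""
    return [p * q for p, q in zip(x, y)]

def add(x, y):
    """Superposition of two patterns."""
    return [p + q for p, q in zip(x, y)]

rng = random.Random(3)
role1, role2, a, b = (random_vector(2000, rng) for _ in range(4))

# A structure in which a fills role1 and b fills role2.
structure = add(bind(role1, a), bind(role2, b))

# Binding the whole structure with bind(a, b) swaps a and b, because
# a * a and b * b are vectors of ones for bipolar patterns.
swapped = bind(structure, bind(a, b))
expected = add(bind(role1, b), bind(role2, a))
substitution_exact = swapped == expected

# Applying the same substitution twice restores the original structure.
restored = bind(swapped, bind(a, b)) == structure
```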

Rules  with  Variables 

An  interesting  consequence  of  this  deci¬ 
sion  is  that  it  turns  every  constant  pattern  vec¬ 
tor  into  a  variable  because  every  constant  may 
be  systematically  substituted.  Earlier  it  was 
noted  that  bindings  could  be  thought  of  as  rules 
restricted  to  constant  terms.  Without  chang¬ 
ing  the  representation  systematic  substitution 
allows  every  binding  to  function  as  a  rule  with 
variables.  Thus,  any  encoding  from  any  epi¬ 
sode  (even  if  encountered  only  once)  becomes 
available  as  a  generalised  rule  to  the  extent 
that  it  can  be  unified  by  systematic  substitu¬ 
tion  with  other  structures. 

It might be the case that in most circumstances systematic substitution is not needed because there is literal similarity between the representations. Processing based on literal similarity is equivalent to systematically substituting each pattern vector for itself. Thus, it could be the case that systematic substitution is continually occurring in the connectionist system but we can only detect it when we are operating in domains where literal similarity is not available.

¹³ A representation bundle(prop(loves,Chris,Pat), prop(loves,Pat,Robin), prop(loves,Robin,Chris)) would be structurally equivalent to any representation of the form bundle(prop(l,c,p), prop(l,p,r), prop(l,r,c)).

¹⁴ Given the episode bundle(prop(loves,Alex,Jerry), prop(loves,Jerry,Lee), prop(loves,Lee,Alex)) the systematic substitutions {Alex↔Chris, Jerry↔Pat, Lee↔Robin} would transform it into the previous episode. Equivalently, the previous episode could be transformed to the current episode by systematic substitution. For current purposes, we ignore the representational details governing whether substitution for the relation (loves) is allowed and whether roles and fillers might be interchangeable.

¹⁵ For other substitution mechanisms based on binding see Halford, Wilson, Guo, Gayler, Wiles & Stewart (1994), Kanerva (1997), and Plate (1997).

Unification  and  Analogical  Mapping 

This design decision asserts that systematic substitution is a necessary component of the cognitive process because of the consequences of the earlier design decisions. Systematic substitution is at the heart of analogical mapping. Therefore, we are asserting that analogical mapping is a necessary component of the cognitive process.

It is worth expanding on the possibility that analogical mapping may be at the heart of cognition. We referred earlier to the problem of the system knowing when to create novel representations and suggested that one possibility is that the representations are functionally dependent on the inputs combined. That is, novel combinations of inputs would result in novel representations.

Given that very few situations are identical (especially if you consider the goals of the cognitive system as a representable component of the situation) this has the potential to make every representation a novel representation. The mechanism proposed here to overcome this is to use a continuous process of systematic substitution to unify¹⁶ the current representation with all previously encountered representations.

We  also  referred  earlier  to  the  multiplex¬ 
ing  of  information  from  multiple  modalities 
over  the  same  representational  resources.  This 
imposed  a  requirement  for  context  to  be  repre¬ 
sented  as  a  component  of  the  content  and  made 
the  interpretation  of  representations  context  de¬ 
pendent.  As  the  number  of  contextual  states  in¬ 
creases  this  also  has  the  effect  of  turning  each 
representation  into  a  novel  representation  (even 
if  the  entity  being  represented  remains  constant). 


¹⁶ Unification is a proof technique used in logic programming which uses substitution of variables to make terms equivalent. It is computationally expensive when implemented by a symbolic algorithm. Weber (1997) developed a connectionist system (not based on the binding techniques discussed here) that implements unification in constant time.
Therefore,  it  is  possible  that  the  cognitive
system  is  thoroughly  promiscuous  in  the  gen¬ 
eration  of  new  representations  and  that  these 
representations  will  appear  to  be  random  when 
viewed  in  isolation.  How  are  these  novel  rep¬ 
resentations  to  be  interpreted  and  acted  upon  if 
they  appear  random? 

A  similar  problem  arises  in  the  technicalities 
of  vector  systems.  Any  vector  can  be  decomposed 
into  contributions  from  a  set  of  basis  vectors. 
By  requiring  new  concept  representations  to  be 
arbitrary  and  dissimilar,  we  have  removed  from 
the  underlying  hardware  the  possibility  of  hav¬ 
ing  a  distinguished  basis  set  (fixed  features)  for 
decomposing  composite  structures  (because  the 
basis  vectors  are  now  the  concept  vectors  which 
are  not  known  until  they  are  created). 

When  faced  with  a  pattern  vector,  how  does 
the  system  know  whether  it  represents  a  new 
concept  or  some  composite  structure  that  may 
be  decomposed  into  other  representations?  This 
is  important  because  a  composite  structure  needs 
to  be  exploited  by  integrating  it  with  related 
knowledge,  whereas  a  novel  concept  should  not 
be  spuriously  integrated  with  prior  knowledge. 

The  solution  proposed  is  to  decompose  it 
with  respect  to  the  other  structures  that  already 
exist  in  short  and  long  term  memory.  Our  intu¬ 
ition  is  that  this  might  be  carried  out  as  a  con¬ 
tinuous  process  of  activation  spreading  from  the 
active  representations  in  short  term  memory, 
through  the  inactive  representations  in  long 
term  memory,  creating  further  activations  in 
short  term  memory. 
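A greatly simplified sketch of decomposition with respect to stored structures follows. It replaces the proposed spreading-activation process with a single similarity probe against a small memory, and it assumes additive superposition of bipolar vectors; the memory contents (`dog`, `cat`, and so on), the threshold, and the function name `decompose` are hypothetical and serve only to illustrate how a composite can be told apart from a novel pattern.

```python
import math
import random

def random_vector(dim, rng):
    """Random bipolar (+1/-1) pattern vector (an assumed representation)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]

def add(x, y):
    """Superposition of two patterns."""
    return [p + q for p, q in zip(x, y)]

def cosine(x, y):
    """Cosine similarity between two vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))

def decompose(vector, memory, threshold=0.3):
    """Return the names of stored patterns that the vector appears to contain."""
    return sorted(
        name for name, stored in memory.items()
        if cosine(vector, stored) > threshold
    )

rng = random.Random(4)
memory = {name: random_vector(2000, rng)
          for name in ("dog", "cat", "tree", "river")}

composite = add(memory["dog"], memory["tree"])  # built from known patterns
novel = random_vector(2000, rng)                # unrelated to anything stored

composite_parts = decompose(composite, memory)
novel_parts = decompose(novel, memory)
```

A pattern that yields no components is treated as a new concept, while one that yields components can be integrated with the knowledge attached to them.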

If  the  process  of  propagating  activation  si¬ 
multaneously  pursues  many  systematic  substi¬ 
tutions  a  shower  of  new  activations  will  be  cre¬ 
ated.  Those  activations  that  are  identical  with 
or  consistently  extend  the  pre-existing  activa¬ 
tions  will  reinforce  those  patterns  and  them¬ 
selves,  while  inconsistent  mappings  die  out  as 
noise  or  remain  as  suppressed  alternative  de¬ 
compositions  to  pursue  if  the  current  one  be¬ 
comes  inconsistent. 

This process can be viewed as a competition between potential decompositions. The first and most consistent decomposition would be more successful than competing decompositions at creating the feedback activations to reinforce itself. Thus the decomposition that occurs is the (or a) correct interpretation of the structure (by virtue of its success). Other potential decompositions are incorrect interpretations of the structure because they were less successful at integrating the active representations and long term memory. This automatically achieves the desired result that a concept vector should be decomposed and integrated wherever possible.

CONCLUSION 

We have suggested a series of connectionist design decisions that seem to follow naturally from the nature of the cognitive task by respecting the essence of connectionist computation. These decisions¹⁷ are:

•  use  distributed  representations  where  the 
individual  units  do  not  have  fixed  meanings; 

•  implement  immediate  learning  as  pattern 
association  via  bind  and  unbind  operators; 

•  bindings  in  short  and  long  term  memory 
must  have  identical  representations; 

•  vectors  representing  new  concepts  should 
be  dissimilar  to  all  pre-existing  concept 
vectors; 

•  the  basic  mode  of  processing  should  be 
continuous  systematic  substitution  of  pat¬ 
tern  vectors. 

As necessary consequences of these decisions the connectionist system will demonstrate behaviour typical of classical symbolic systems and place analogy as the primary cognitive process. Interpretation of these decisions suggests that cognition is promiscuously analogical and that the basic mechanism of cognition consists of a continuous process of unification through systematic substitution of currently active representations with each other, all representations in long term memory, and new representations being created. In effect, this is a continuous data mining operation on a massive scale.

¹⁷ Connectionist systems exist demonstrating all but the last decision (which we believe is plausible within the current state of the art of systematic substitution by binding and unbinding). Even with an implementation of continuous systematic substitution all the design decisions will need to be integrated and there will be auxiliary problems to be solved before a connectionist model of higher cognition can be demonstrated.
ABSTRACT

An  important  aspect  of  the  process  of  form¬ 
ing  analogies  is  the  ability  to  extend  knowledge 
of  a  target  domain  by  virtue  of  its  similarity  to  a 
base  domain.  Extant  theories  of  analogy  sug¬ 
gest  that  information  is  carried  from  base  to  tar¬ 
get  when  it  is  connected  to  a  correspondence 
between  the  domains  and  is  structurally  consis¬ 
tent  with  the  current  match.  Some  theories  fur¬ 
ther  suggest  that  information  is  likely  to  be  car¬ 
ried  from  base  to  target  when  it  is  pragmatically 
relevant  to  the  current  situation.  I  present  stud¬ 
ies  that  examine  the  contributions  of  structure 
and  pragmatic  relevance  on  analogical  inference 
using  a  technique  in  which  people  play  the  role 
of  a  student  or  a  financial  officer  transferring 
from  one  college  to  another.  The  results  indi¬ 
cate  that  systematicity  and  pragmatic  relevance 
play  distinct  roles  in  analogical  inference. 

INTRODUCTION 

There  is  general  agreement  among  re¬ 
searchers  that  analogy  involves  sub-processes 
including  representing  the  domains,  finding  a 
mapping  between  them,  verifying  the  goodness 
of  the  mapping,  and  carrying  inferences  from 
one  domain  (called  the  base)  to  a  second  do¬ 
main  (called  the  target).  The  process  of  ana¬ 
logical  inference  has  been  the  object  of  study 
for  two  reasons.  First,  the  ability  to  create  ana¬ 
logical  inferences  is  an  important  avenue  of 
knowledge  change,  because  it  allows  one  domain
to  be  extended  by  virtue  of  its  similarity
to  another.  Second,  existing  computational 
models  of  analogy  disagree  on  the  mechanisms 
by  which  candidate  inferences  are  generated, 
and  so  data  that  bear  on  this  issue  will  help  con¬ 
strain  these  computational  models. 

Much of the work related to analogical inference has been done in the context of transfer in problem solving (e.g., Gick & Holyoak, 1980; Ross, 1989). This work has focused primarily on how whole solutions to old problems can be carried over to new ones. More recently, work has focused on factors that determine which pieces of information about a base domain are likely to be inferred of a target. Two central constraints on inference that have been studied are systematicity (Clement & Gentner, 1991; Markman, 1997) and pragmatics (Spellman & Holyoak, 1996). In this paper, I first briefly review the work on systematicity and pragmatics. Then, I present three studies that examine both pragmatics and systematicity in order to examine their relative importance as constraints on inference. I conclude with a discussion of the implications of this work for existing models of analogy.

SYSTEMATICITY  AND  PRAGMATICS 

Analogical  inference  must  be  constrained, 
because  not  every  fact  true  of  a  base  domain 
will  also  be  true  of  the  target.  Indeed,  for  dis¬ 
tant  analogues,  most  of  the  facts  about  the  base 
domain  will  not  be  true  of  the  target.  If  every 
fact  about  the  base  were  carried  to  the  target,
then  most  of  the  information  inferred  would  be
false,  and  the  inference  process  would  not  be 
useful  (because  the  reasoner  would  waste  con¬ 
siderable  time  rejecting  false  inferences). 

The first constraint, systematicity, is the notion that connected relational systems are preferred to collections of individual relations (Gentner, 1983; Gentner, 1989). Systematicity constrains inference by requiring that the facts carried over from the base be connected to matching information in the target. That is, the inferences must involve shared system facts. The assumption is that a fact connected to a match between base and target is more likely to be relevant than is a fact not connected to the match.

For example, imagine that you know two facts about a friend:

(1) John likes to eat ice cream, causing him to be slightly overweight.

(2) John likes old movies, causing him to stay up late watching TV.

Suppose that you then strike up an email correspondence with a new person, Mary, who likes old movies. Systematicity suggests that you should infer that Mary stays up late watching TV (a shared system fact) rather than that Mary is slightly overweight (a nonshared system fact). Previous studies of analogical inference have shown that people making analogical inferences are much more likely to infer shared system facts than nonshared system facts (Clement & Gentner, 1991; Markman, 1997).

[Figure 1, base domain. Biology Department: great teachers, causing (a) students to be motivated to learn and (b) faculty to get external offers and leave. Political Science Department: faculty argue, causing (a) faculty to be inaccessible and (b) the department to split into two departments.]
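The John and Mary example can be written out as a small sketch of how systematicity constrains inference. The data structure (a list of antecedent and consequent pairs) and the function names are assumptions made for illustration; they are not drawn from any of the computational models discussed in this paper.

```python
# Facts known about the base (John), as (causal antecedent, consequent) pairs.
base_facts = [
    ("likes to eat ice cream", "is slightly overweight"),
    ("likes old movies", "stays up late watching TV"),
]

# What is known about the target (Mary).
target_facts = {"likes old movies"}

def shared_system_inferences(base, target):
    """Consequents whose causal antecedent matches a fact in the target."""
    return [cons for ante, cons in base if ante in target]

def nonshared_system_inferences(base, target):
    """Consequents whose causal antecedent has no match in the target."""
    return [cons for ante, cons in base if ante not in target]

preferred = shared_system_inferences(base_facts, target_facts)
dispreferred = nonshared_system_inferences(base_facts, target_facts)
```

Under systematicity, only the consequents in `preferred` are carried over to the target; those in `dispreferred` are not connected to the match and are less likely to be inferred.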

In addition to systematicity, pragmatic information also seems useful for constraining analogical inference. If you know in advance that a particular piece of information is of interest, then you should be more likely to infer that information. For example, if you strike up an email correspondence with Mary, and realize that she generally reminds you of your friend John, then you might want to make inferences about Mary based on what you know about John (Andersen & Cole, 1990). If you are particularly interested in whether Mary watches TV, you might focus selectively on the inference that she stays up late watching TV, because it is relevant to your goals.

Some  research  has  also  examined  the  in¬ 
fluence  of  pragmatic  information  on  inference 
(Spellman  &  Holyoak,  1996).  These  studies 
demonstrated  that  goals  active  when  process¬ 
ing  an  analogy  can  influence  what  information 
people  place  in  correspondence  when  making 
a  mapping,  and  can  also  influence  what  facts 
from  the  base  domain  are  drawn  as  inferences. 

In the present studies, we examine the relative strengths of systematicity and pragmatics as constraints on analogical inference. For this purpose, we adapted materials used by Markman (1997). The design of the study is shown in Figure 1. Participants were given descriptions of departments in a college. The description of each department contained a causal antecedent (e.g., The faculty in the biology department are great teachers) and two consequents following from that antecedent (e.g., students [in biology] are motivated to learn, and faculty [in biology] get external offers and leave the university).

[Figure 1, target domain. Music Department: great teachers; faculty argue.]

Figure 1. Illustration of part of the base and target domains for the experiments. The actual materials were paragraph descriptions of departments. In the studies, there were descriptions of four departments in the base, and two departments in the target. Half of the causal consequents in each department were relevant to the student context, and half were relevant to the financial office context.

The  descriptions  of  the  departments  were 
constructed  so  that  half  of  the  causal  consequents 
in  the  base  domain  were  most  relevant  to  stu¬ 
dents  at  the  university,  and  half  were  most  rele¬ 
vant  to  financial  officers.  All  materials  were  pre¬ 
tested  to  ensure  that  the  causal  consequents  were 
primarily  related  to  only  one  of  the  contexts. 

In  Experiment  1,  half  of  the  subjects  are  told
at  the  beginning  of  the  study  that  they  are  playing
the  role  of  a  student  at  one  university  who  is
about  to  transfer  to  a  second  university.  The  other
half  of  the  subjects  are  told  that  they  are  playing 
the  role  of  a  financial  officer  at  one  university 
who  is  about  to  take  a  job  at  a  second  university. 
After  reading  the  descriptions  of  the  departments 
in  the  old  university  (the  base  domain),  they  are 
given  descriptions  of  two  departments  in  the  new 
university  (the  target  domain).  Subjects  are  told 
that  they  do  not  know  too  much  about  the  new 
university  yet,  because  they  are  just  arriving,  and 
they  are  asked  to  make  predictions  about  what 
to  expect  at  the  new  school  based  on  what  they 
know  about  the  old  school. 

Based  on  previous  research,  we  expect  that 
the  inferences  people  generate  will  generally 
reflect  shared  system  inferences  rather  than 
nonshared  system  inferences.  Further,  people
should  tend  to  infer  information  that  is  relevant 
to  them.  That  is,  subjects  in  the  student  condi¬ 
tion  should  tend  to  infer  student-relevant  infor¬ 
mation,  while  subjects  in  the  financial  officer 
condition  should  generally  infer  financial  of¬ 
ficer-relevant  information. 

A  key  question  involves  which  constraint  will 
be  more  important  in  inference.  One  possibility 
is  that  people  will  make  primarily  shared  system
inferences,  but  that  within  the  shared  system  inferences
made,  there  will  be  more  student-relevant
facts  inferred  by  people  given  the  student
cover  story,  and  more  financial  officer-relevant 
facts  inferred  by  people  given  the  financial  offic¬ 
er  cover  story.  A  second  possibility  is  that  prag¬ 
matics  is  most  central.  On  this  view,  people  will 
focus  primarily  on  pragmatically  relevant  infor¬ 
mation  regardless  of  whether  it  is  a  shared  system 
fact  or  a  nonshared  system  fact. 

EXPERIMENT 1

Method 

Subjects 

Subjects  in  this  experiment  were  48  under¬ 
graduates  at  Columbia  University  (24/condi¬ 
tion),  who  were  paid  to  participate. 

Design 

The  main  dependent  variable  in  this  study 
is  the  number  of  inferences  made.  The  infer¬ 
ences  made  can  be  scored  as  Shared  or  Non¬ 
shared  system  inferences.  Half  of  the  facts  in 
the  base  domain  are  Student-relevant,  and  half 
are  Financial  Officer-relevant.  The  indepen¬ 
dent  variable  in  this  study  is  Cover  story,  which 
has  two  levels  (Student  and  Financial  Officer). 

Materials  and  Procedure 

The  experimental  materials  were  placed  in 
booklets.  The  booklets  began  with  instructions 
that  described  the  cover  stories.  In  the  student 
condition,  the  subject  was  told  that  they  were  a 
student  at  one  university  (Gordmont  Universi¬ 
ty)  and  that  they  were  transferring  to  a  second 
university  (Fallsburg  University).  In  the  finan¬ 
cial  officer  condition,  subjects  were  told  that 
they  were  a  financial  officer  who  worked  at 
Gordmont  University,  and  they  were  taking  a 
new  job  at  Fallsburg  University. 

After  the  cover  stories  were  the  descriptions 
of  four  departments  at  the  first  college.  As  sum¬ 
marized  in  Figure  1,  each  department  consist¬ 
ed  of  paragraphs  describing  a  fact  that  served 
as  a  causal  antecedent.  This  antecedent  caused 
two  consequents.  One  consequent  was  pretested
to  be  relevant  primarily  to  students,  and  the
other  was  pretested  to  be  relevant  primarily  to 
financial  officers.  The  consequents  were 
judged  by  the  authors  to  be  plausible  conse¬ 
quences  of  the  causal  antecedent. 

After  reading  the  descriptions  of  the 
base  domain,  subjects  were  given  a  quiz. 
Markman  (1997)  used  a  similar  quiz  to  en¬ 
sure  that  subjects  actually  read  the  informa¬ 
tion  about  the  base  domain  carefully.  The 
quiz  had  one  question  relevant  to  each  causal 
consequent. 

Following  the  quiz,  subjects  were  shown 
the  descriptions  of  two  departments  at  the 
new  college.  These  descriptions  were  short¬ 
er,  and  contained  information  about  possible 
causal  antecedents,  without  any  information 
about  what  occurred  as  the  result  of  these  an¬ 
tecedents. 

After  reading  about  the  departments  at  the 
new  school,  people  were  asked  to  use  their  ex¬ 
perience  at  the  old  school  to  make  predictions 
about  what  might  happen  at  the  new  school. 
Subjects  were  encouraged  to  make  as  many 
predictions  as  they  wanted.¹  Subjects  were  given
only  one  booklet  for  this  class,  and  so  they
could  go  back  and  look  at  the  base  domain  when 
making  inferences. 

Following  the  inference  task,  people  were 
asked  to  say  which  departments  in  the  old 
school  corresponded  to  each  department  in  the 
new  school.  No  specific  predictions  are  made 
about  performance  in  this  mapping  task,  and  it 
will  not  be  discussed  further  in  this  paper. 


¹ Unlike the studies by Markman (1997), the question in the inference task was open-ended. In previous studies, subjects were asked to make predictions about what would happen given particular facts about the new school. This task may have focused people on specific facts, and inflated the importance of shared system facts connected to those causal antecedents. Thus, these previous studies may have overestimated the importance of shared system facts in inference. The open-ended question does not focus people on particular causal antecedents, and so does not lead to the same potential bias. Because the data from the present studies are similar to those of previous studies, it is unlikely that the phrasing of the question in those studies inflated the importance of shared system facts.


Results 

Inferences were coded as shared system
facts, nonshared system facts, or other. To be
scored as a shared system or nonshared system
inference, the subject had to mention a
particular causal antecedent and a fact that
followed from it. Shared system inferences
were those inferred items that were causal
consequents from a shared causal antecedent.
For example, inferring that faculty will get
external job offers and leave given that faculty
in the department were good teachers would
be a shared system inference. Nonshared system
inferences were those for which the inferred
causal consequent was not connected
to a matching antecedent from the base. For
example, inferring that the faculty in a department
argue, which will cause them to get outside
offers and leave, would be a nonshared
system inference. All other inferences were
scored as other. In the interest of space, inferences
scored as other will not be discussed
further in this paper.
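
The coding rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' scoring procedure: the names `CausalInference` and `code_inference` and the sample facts are hypothetical, and the sketch assumes each response has already been hand-parsed into an antecedent and a consequent.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class CausalInference:
    """One hand-parsed response (hypothetical representation)."""
    antecedent: Optional[str]  # causal antecedent the subject mentioned, if any
    consequent: Optional[str]  # fact the subject predicted would follow


def code_inference(
    inference: CausalInference,
    base_links: Set[Tuple[str, str]],
    target_antecedents: Set[str],
) -> str:
    """Return 'shared', 'nonshared', or 'other' for a single inference."""
    # A response lacking either an antecedent or a consequent is scored 'other'.
    if inference.antecedent is None or inference.consequent is None:
        return "other"
    link = (inference.antecedent, inference.consequent)
    # Shared system: the consequent follows, in the base, from an antecedent
    # that is also present in the target.
    if link in base_links and inference.antecedent in target_antecedents:
        return "shared"
    # Otherwise the consequent is not tied to a matching base antecedent.
    return "nonshared"


# Illustrative facts loosely based on the examples in the text.
BASE_LINKS = {("good teachers", "outside offers and leave")}
TARGET_ANTECEDENTS = {"good teachers"}

print(code_inference(
    CausalInference("good teachers", "outside offers and leave"),
    BASE_LINKS, TARGET_ANTECEDENTS))
print(code_inference(
    CausalInference("faculty argue", "outside offers and leave"),
    BASE_LINKS, TARGET_ANTECEDENTS))
```

Under these assumptions the first call is coded as a shared system inference and the second as a nonshared system inference, mirroring the two examples given in the text.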

Each inference was also scored as student-relevant
or financial officer-relevant. These
determinations were based on how a fact was
classified in the pretests described above.

The data were analyzed in a 2 (shared vs.
nonshared system inference) x 2 (student-relevant
vs. financial officer-relevant) x 2 (Cover
story) mixed model ANOVA. As expected,
people made more shared system inferences
(M=2.31) than nonshared system inferences
(M=0.54), F(1,46)=58.45, p<.001. The only
other reliable effect was an expected interaction
between Cover story and Relevance of fact,
F(1,46)=5.25, p<.05. This interaction reflects
that subjects given the student cover story made
inferences of significantly more student-relevant
facts (M=1.75) than financial officer-relevant
facts (M=1.08), t(23)=2.56, p<.05 (Bonferroni).
In contrast, students given the financial
officer cover story made inferences of more
financial officer-relevant facts (M=1.50) than
student-relevant facts (M=1.37), although this
difference was not significant, t(23)=0.55,
p>.10 (Bonferroni).

Discussion

These data demonstrate effects of both systematicity
and pragmatic relevance on analogical
inference. First, replicating previous research,
people were far more likely to infer
shared system facts than to infer nonshared system
facts. In addition, they were more likely to
give information that was relevant to their cover
story than to give information not relevant
to their cover story.

These data further suggest that systematicity
is a stronger constraint on inference than is
pragmatics. In particular, there were many
shared system facts inferred that were not relevant
to a subject’s cover story. In contrast, there
were few nonshared system facts inferred overall.
Thus, people appear to filter the inferences
they make first by focusing on shared system
facts. Once the shared system facts have been
found, people can then focus more selectively
on those relevant to their goals.

One  possible  explanation  for  why  the  influ¬ 
ence  of  structure  appeared  stronger  than  the  in¬ 
fluence  of  pragmatics  is  that  people  were  able  to 
look  back  at  the  base  domain  when  making  infer¬ 
ences.  This  explanation  assumes  that  one  impor¬ 
tant  role  of  pragmatic  goals  is  to  focus  people  on 
information  that  is  likely  to  be  relevant  when  faced 
with  a  heavy  memory  load.  Because  people  were 
able  to  look  back  at  the  base  and  target  domains 
in Experiment 1, there was no significant memory
load. Thus, pragmatic information may have
been less useful than it would be if the base domains
were in memory.

To  test  this  possibility,  we  repeated  Exper¬ 
iment  1,  except  that  there  were  two  booklets. 
One  contained  the  base  domain  and  the  quiz. 
The  other  contained  the  target  domain  and  the 
inference  and  mapping  tasks.  At  the  beginning 
of  the  study,  subjects  were  given  the  base  do¬ 
main  and  the  quiz.  After  completing  the  quiz, 
the  first  booklet  was  taken  away,  and  the  sec¬ 
ond  booklet  with  the  target  domain  and  the  in¬ 
ference  and  mapping  tasks  was  given.  Thus, 
subjects  had  to  recall  information  about  the  base 
domain from memory. If pragmatic information
has its influence primarily on memory, then
the effects of pragmatics relative to those of
structure should be stronger in Experiment 2
than they were in Experiment 1. Otherwise,
the data are expected to look much like those
of Experiment 1.

EXPERIMENT  2 
Method 

Subjects 

Subjects  in  this  study  were  48  members  of 
the  Columbia  University  community  who  were 
paid  for  their  participation. 

Materials,  Procedure,  and  Design 

The  materials,  procedure,  and  design  of  Ex¬ 
periment  2  were  identical  to  those  of  Experiment 
1  with  the  following  change.  The  booklets  gen¬ 
erated  in  Experiment  1  were  split  into  two  parts. 
The  first  part  contained  only  the  instructions  with 
the  cover  story,  the  description  of  the  base  do¬ 
main,  and  the  quiz.  The  second  part  contained 
the  description  of  the  target  domain,  the  infer¬ 
ence  task  and  the  mapping  task.  After  complet¬ 
ing  the  quiz  in  the  first  part,  subjects  had  to  turn 
in  the  first  booklet  in  order  to  receive  the  second 
and  to  complete  the  experiment. 

Results 

Once again, the data were scored as shared
system facts and nonshared system facts. In
addition, the information was marked as student-relevant
or financial officer-relevant.
Again, the data were analyzed with a 2 (shared
vs. nonshared system inference) x 2 (student-relevant
vs. financial officer-relevant) x 2 (Cover
story) mixed model ANOVA.

The results of this study are quite similar
to those of Experiment 1. Once again, people
made more shared system inferences (M=2.29)
than nonshared system inferences (M=0.58),
F(1,46)=42.23, p<.001. There was also a reliable
interaction between Cover story and Relevance
of fact, F(1,46)=4.10, p<.05. As before,
this interaction reflects that people given the student
cover story made more student-relevant
inferences (M=1.46) than financial officer-relevant
inferences (M=1.25) and people in the
financial officer condition made more financial
officer-relevant inferences (M=1.75) than student-relevant
inferences (M=1.29). Neither of
these simple effects was reliable, however.

Finally, there was a marginally significant
interaction between Inference type and Relevance,
F(1,46)=2.91, .05<p<.10. This interaction
reflects that, collapsing across cover stories,
there was a tendency for people to make fewer
shared system inferences of student-relevant
facts (M=1.04) than of financial officer-relevant
facts (M=1.25), but more nonshared system inferences
of student-relevant facts (M=0.33) than
of financial officer-relevant facts (M=0.25).
Neither of these simple effects is reliable.
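
As a check on how the reported means fit together, the cell means of the Inference type x Relevance interaction should sum to the overall mean for each inference type. The short sketch below assumes that the reported values are mean counts per subject and that they add across the two relevance categories; the variable and function names are illustrative only.

```python
import math

# Mean inferences per subject in Experiment 2, collapsed across cover stories.
cell_means = {
    "shared": {"student": 1.04, "financial": 1.25},
    "nonshared": {"student": 0.33, "financial": 0.25},
}
# Overall means reported for each inference type.
reported_totals = {"shared": 2.29, "nonshared": 0.58}


def totals_by_type(cells):
    """Sum the relevance cells within each inference type."""
    return {kind: sum(by_relevance.values()) for kind, by_relevance in cells.items()}


computed = totals_by_type(cell_means)
for kind, total in computed.items():
    # Compare with a tolerance, since the reported means are rounded.
    consistent = math.isclose(total, reported_totals[kind], abs_tol=0.01)
    print(f"{kind}: {total:.2f} (reported {reported_totals[kind]:.2f}), consistent={consistent}")
```

Both sums agree with the overall means, which is what one would expect if the shared and nonshared totals are simply split by relevance.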

Discussion 

The  results  of  Experiment  2  provide  addi¬ 
tional  evidence  that  systematicity  is  a  more 
powerful  constraint  on  analogical  inference 
than is pragmatics. As in Experiment 1, people
made far more shared system inferences
than  nonshared  system  inferences.  There  was 
a  tendency  for  people  to  make  inferences  of 
facts  related  to  their  pragmatic  goals,  but  this 
tendency  was  small  relative  to  the  influence  of 
systematicity. 

Experiment  2  extends  the  findings  of  Ex¬ 
periment  1,  because  people  could  not  look 
back  at  the  base  domain  when  making  infer¬ 
ences  in  this  study.  We  speculated  that  being 
able  to  look  back  at  the  base  domains  might 
have  decreased  the  influence  of  pragmatic 
goals.  In  contrast  to  this  speculation,  having 
to  access  the  base  domain  from  memory  did 
not  make  the  influence  of  pragmatic  goals  on 
inference  stronger. 

While pragmatic goals have a weaker influence
on analogical inference than does systematicity,
they have still had a reliable influence
on inferences in two studies. Thus, it is
worth considering where these goals have their
influence. Spellman and Holyoak (1996) contrasted
two possible influences of pragmatics.
One possibility was that pragmatics influenced
pre-mapping representational processes. On
this view, information about domains is filtered
by pragmatic goals, and only goal-relevant information
is stored. Alternatively, pragmatic
goals might have their influence during the
mapping and inference phases.

To  test  this  possibility,  Spellman  and  Ho¬ 
lyoak  (1996)  varied  when  in  the  experiment 
people  were  given  pragmatic  information.  It 
was  either  given  prior  to  the  presentation  of  the 
base  domain  (in  which  case  it  could  have  some 
influence  on  what  was  stored)  or  after  the  pre¬ 
sentation  of  the  base  domain  (in  which  case,  it 
could  not  have  influenced  what  was  stored). 
Regardless  of  when  pragmatic  information  was 
presented,  an  influence  of  pragmatic  goals  was 
found,  leading  Spellman  and  Holyoak  to  con¬ 
clude  that  pragmatics  has  its  influence  after  the 
domains  are  represented. 

We  performed  a  similar  study  with  our 
materials.  As  in  Experiment  2,  the  base  and 
target  domains  were  presented  in  separate  book¬ 
lets.  The  subjects  in  this  study  were  given  the 
cover  story  after  taking  the  quiz  and  receiving 
the  second  booklet.  This  group  of  subjects  had 
already  seen  the  base  domain,  and  so  the  prag¬ 
matic  information  could  not  influence  what  was 
learned  about  it. 

If,  as  Spellman  and  Holyoak  (1996)  sug¬ 
gested,  pragmatic  information  has  its  influence 
after  the  representation  process  is  completed, 
then  we  should  obtain  the  same  results  in  Ex¬ 
periment  3  that  were  observed  in  the  first  two 
studies.  In  contrast,  if  pragmatic  information 
has  its  influence  during  the  construction  of  rep¬ 
resentations,  then  the  influence  of  the  cover  sto¬ 
ry  should  be  eliminated  in  Experiment  3. 

EXPERIMENT  3 
Method 

Subjects 

Subjects in this experiment were 48 members
of the Columbia University community (24 per condition)
who were paid for their participation.

Materials, Procedure, and Design

This experiment was identical to Experiment
2, except that the cover story was presented
to subjects after they completed the quiz and
before they read about the target domain.

Results 

Once  again,  the  inferences  were  scored  as 
shared  system  facts,  nonshared  system  facts  or 
other  facts.  The  shared  and  nonshared  system 
facts  were  further  scored  as  student-relevant  or 
financial  officer-relevant.  The  data  were  ana¬ 
lyzed  with  a  2  (shared  vs.  nonshared  system 
inference)  x  2  (student-relevant  vs.  financial 
officer-relevant)  x  2  (Cover  story)  mixed  mod¬ 
el  ANOVA. 

As in Experiments 1 and 2, people made
more shared system inferences overall
(M=2.15) than nonshared system inferences
(M=0.75), F(1,46)=63.77, p<.001. In addition,
as in Experiments 1 and 2, there was a significant
interaction between Cover story and Relevance
of fact, F(1,46)=11.12, p<.01. This interaction
reflects that subjects given the student
cover story made more student-relevant inferences
(M=1.96) than financial officer-relevant
inferences (M=0.82), t(23)=3.87, p<.01 (Bonferroni).
In contrast, subjects given the financial
officer cover story made more financial
officer-relevant inferences (M=1.54) than student-relevant
inferences (M=1.46), although
this difference was not significant, t(23)=0.39,
p>.10 (Bonferroni).

In addition to these expected effects,
there were also two unexpected effects.
There was a main effect of Relevance of fact,
F(1,46)=8.27, p<.01. This main effect reflects
that overall there were more student-relevant
inferences (M=1.71) than financial officer-relevant
inferences (M=1.19). Finally, there
was a reliable interaction between Cover story
and Shared vs. nonshared system,
F(1,46)=4.11, p<.05. This interaction reflects
that people tended to make slightly more
shared system inferences when given the financial
officer cover story (M=2.38) than
when given the student cover story (M=1.92),
but to make slightly more nonshared system
inferences when given the student cover story
(M=0.88) than when given the financial
officer cover story (M=0.63). Neither of these
simple effects is significant.

Discussion 

The  results  of  Experiment  3  are  parallel 
to  those  of  Experiments  1  and  2.  Once  again, 
there  was  a  strong  tendency  for  people  to  in¬ 
fer  shared  system  facts  rather  than  nonshared 
system  facts.  In  addition,  people  were  also 
more  likely  to  infer  facts  relevant  to  the  cov¬ 
er  story  given  rather  than  facts  not  relevant 
to  the  cover  story. 

The  strong  influence  of  pragmatics  in  this 
experiment  suggests  that  pragmatic  information 
has  its  influence  after  the  representations  of  the 
domains  have  been  formed.  That  is,  people 
could  not  filter  out  information  about  the  base 
domain  using  their  goals,  because  these  goals 
were  not  presented  until  after  the  base  domains 
were  encoded.  This  pattern  of  data  is  sensible, 
because  people  often  cannot  know  their  goals 
in  advance.  Encoding  as  much  information  as 
possible  is  advantageous,  because  it  allows  in¬ 
formation  relevant  to  an  unforeseen  goal  to  in¬ 
fluence  cognitive  processing. 

An  unexpected  finding  in  this  study  was 
that  people  inferred  more  student-relevant 
facts  overall  than  financial  officer-relevant 
facts.  This  finding  may  reflect  an  influence 
of  background  knowledge  on  memory.  In  this 
study,  people  were  not  given  the  cover  story 
until  after  they  read  about  the  base  domain. 
Thus,  they  had  to  bring  their  own  experience 
to  bear  when  interpreting  the  base  domain. 
Because  all  of  the  participants  in  this  study 
were  students,  it  is  likely  that  they  found  the 
student-relevant  information  more  salient  or 
more  comprehensible  than  the  financial  offic¬ 
er-relevant  information.  This  suggestion  is 
compatible  with  a  variety  of  studies  demon¬ 
strating  that  memory  for  new  information  in 
familiar  domains  is  better  than  memory  for 
new  information  in  unfamiliar  domains 
(Bransford & Johnson, 1972, 1973).

GENERAL DISCUSSION

Analogical  inference  is  a  powerful  way  of 
extending  one  domain  based  on  its  similarity 
to  another.  Because  much  of  the  information 
about  a  base  domain  is  unlikely  to  be  true  in 
the  target  domain,  it  is  necessary  to  constrain 
the  inference  process.  The  best  constraints  are 
those  that  focus  people  on  the  information  in 
the  base  domain  that  is  most  likely  to  be  true 
about  the  target. 

Two  constraints  on  analogical  inference 
examined  here  were  systematicity  and  pragmat¬ 
ics.  Strong  support  for  the  influence  of  syste¬ 
maticity  was  obtained  in  these  studies,  as  sub¬ 
jects  inferred  far  more  shared  system  facts  than 
nonshared  system  facts.  Support  for  pragmat¬ 
ics  was  also  obtained,  as  people  were  generally 
more  likely  to  infer  facts  relevant  to  their  cover 
story.  This  influence  of  pragmatics  was  evi¬ 
dent  even  in  Experiment  3  where  the  goal  was 
not  provided  until  after  the  base  domain  was 
read. This finding suggests that pragmatics does
not  filter  out  information  during  encoding,  but 
rather  works  during  the  mapping  or  inference 
stage.  Further  research  will  have  to  pinpoint 
the  locus  of  the  effects  of  pragmatics. 

While  both  systematicity  and  pragmatics  had 
an  influence  on  analogical  inference,  systema¬ 
ticity  was  a  much  stronger  constraint  in  these 
studies.  People  generally  inferred  causal  conse¬ 
quents  that  were  related  to  matching  causal  an¬ 
tecedents.  Having  used  systematicity  to  con¬ 
strain  the  set  of  possible  inferences,  people  were 
then  somewhat  more  likely  to  infer  information 
relevant  to  their  cover  story.  However,  in  all 
conditions,  there  were  still  many  inferences  of 
information  not  relevant  to  the  cover  story.  It  is 
possible  that  the  effects  of  pragmatics  would  be 
stronger  if  the  consequences  for  failing  to  achieve 
the  goal  were  more  dire.  In  the  present  experi¬ 
ments,  people  were  simply  given  a  cover  story, 
but  were  not  rewarded  selectively  for  inferences 
relevant  to  their  cover  story,  or  penalized  for  in¬ 
ferences  not  relevant  to  their  cover  story.  None¬ 
theless,  the  present  results  strongly  suggest  that 
systematicity  is  a  more  powerful  constraint  on 
inference  than  is  pragmatic  relevance. 


Implications for Computational Models

There are a number of comprehensive models
of analogical reasoning, and all of them have
mechanisms for generating analogical inferences.²
In this discussion, we focus on three prominent
models: SME (Falkenhainer, Forbus, &
Gentner, 1989), LISA (Hummel & Holyoak,
1997), and IAM (Keane, Ledgeway, & Duff,
1994).³ This discussion will assume a basic
familiarity with these models.

The SME model assumes that candidate
inferences involve carrying over facts from the
base domain that are connected to matching
systems. Thus, SME is consistent with the observed
use of systematicity to constrain analogical
matches. SME has been extended to
incorporate pragmatic information as well (Forbus
& Oblinger, 1990). This extension marks
pragmatically relevant representational elements,
and then attempts to use the goal-relevant
information in the preferred mapping, and
in the candidate inferences generated. This use
of pragmatics is consistent with the idea that
systematicity is a stronger constraint on analogical
inference than pragmatics.

There are two ways in which SME has difficulty
explaining the present data. First, the
implementation of pragmatic marking in SME
is too strong. In the data, pragmatic information
appears to provide a small increase in the salience
of facts relevant to the goal. In contrast,
SME will carry over every marked fact that is
structurally consistent with the match between
base and target. Thus, in order to allow SME
(with pragmatic marking) to account for the
present data, some mechanism must be established
to determine how nodes are given pragmatic
marking. This account would have to
assume that not all representational elements that
are goal-relevant get marked, or that not all relevant
facts are posited as candidate inferences.

²There are also many specialized models that do not
incorporate analogical inference mechanisms. For example,
Halford et al.’s STAR model uses tensor products in a
connectionist model to do A:B::C:D analogies (Halford,
Wilson, Guo, Wiles, & Stewart, 1994). Candidate inferences
are not needed to solve this type of analogy problem.

³Holyoak and Thagard’s (1989) ACME is not considered
here. Its candidate inference mechanism is not constrained
by systematicity, and has difficulty making inferences
when there are potential many-to-one matches (Markman,
1997).

Second,  some  information  that  is  not  goal¬ 
relevant  is  also  inferred  by  subjects  in  the 
present  studies.  Thus,  the  pragmatic  marking 
account  must  also  explain  why  some  (but  not 
all)  non-goal-relevant  facts  are  inferred. 
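
The contrast drawn above can be made concrete with a toy sketch. It does not implement SME or its pragmatic-marking extension; the `Fact` record, both functions, and all probability values are hypothetical, chosen only to show how all-or-none marking differs from a graded boost in salience.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fact:
    name: str
    structurally_consistent: bool  # connected to the matching system
    goal_relevant: bool            # relevant to the reasoner's cover story


def all_or_none_marking(facts: List[Fact]) -> List[str]:
    """Carry over every goal-relevant (marked) fact that is structurally consistent."""
    return [f.name for f in facts if f.structurally_consistent and f.goal_relevant]


def graded_salience(
    facts: List[Fact],
    shared_rate: float = 0.4,     # made-up chance of inferring a shared system fact
    nonshared_rate: float = 0.1,  # made-up chance of inferring a nonshared system fact
    boost: float = 0.2,           # made-up increase for goal-relevant facts
) -> Dict[str, float]:
    """Give each fact an inference probability: structure dominates, relevance nudges."""
    probabilities = {}
    for f in facts:
        rate = shared_rate if f.structurally_consistent else nonshared_rate
        probabilities[f.name] = rate + (boost if f.goal_relevant else 0.0)
    return probabilities


facts = [
    Fact("relevant shared fact", True, True),
    Fact("irrelevant shared fact", True, False),
    Fact("relevant nonshared fact", False, True),
]
print(all_or_none_marking(facts))
print(graded_salience(facts))
```

In the all-or-none version, irrelevant shared facts are never inferred and relevant shared facts always are. In the graded version, some goal-relevant facts are missed and some non-goal-relevant facts are still inferred, which is the qualitative pattern the text says a pragmatic marking account would need to explain.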

The IAM model generates candidate inferences
by completing partially matching
systems. This assumption constrains IAM to
infer only shared system facts. Pragmatic information
influences IAM by determining
which predicates are used for the match between
base and target. Goal-relevant predicates
are more likely than non-goal-relevant
predicates to be selected at the early stages
of the match process to be parts of the correspondence.
In the end, however, IAM generates
a match that includes both relevant and
irrelevant matches in a situation like the one
in the present studies, because both the relevant
and irrelevant information can be incorporated
into a structurally consistent match.
Thus, IAM cannot explain why goal-relevant
inferences were more common than non-goal-relevant
inferences.

Finally, it is not clear what LISA predicts
for this task. Comprehensive tests of the candidate
inference mechanism in the LISA model have
not yet been published, but Hummel (personal
communication) suggests that LISA exhibits a
preference for shared system facts over nonshared
system facts in analogy. There are a number
of ways that pragmatic information could be
incorporated into LISA. Relational bindings are
represented in this model by having nodes that
correspond to predicates, relational roles, and
arguments to those relations fire in phase with
one another, and out of phase with nodes representing
other relational bindings. Pragmatically
relevant bindings can be fired more often than
pragmatically irrelevant bindings. A mechanism
like this would help ensure that pragmatically
relevant information is incorporated in the mapping
that is generated. It is possible that this
mechanism would also lead to more goal-relevant
inferences than non-goal-relevant inferences.
At this time, however, it is not possible to
make any firm predictions.

CONCLUSIONS 

The  three  experiments  in  this  paper  dem¬ 
onstrate  that  systematicity  and  pragmatics  are 
important  constraints  on  analogical  inference. 
Shared  system  facts  are  more  frequently  in¬ 
ferred  than  nonshared  system  facts.  Likewise, 
goal-relevant  facts  are  more  frequently  inferred 
than  non-goal-relevant  facts.  Further,  system¬ 
aticity  appears  to  be  a  more  powerful  constraint 
on  mapping  than  is  pragmatics. 

Currently, none of the comprehensive
computational models accounts for all of the
data. All of the models have mechanisms for
implementing both systematicity and pragmatics.
However, SME cannot account for why
some goal-relevant facts are not inferred, while
some non-goal-relevant facts are inferred.
IAM cannot account for why there is a difference
in the number of goal-relevant and non-goal-relevant
facts inferred. Finally, LISA
exhibits a preference for shared system facts,
but its inference mechanism has not been specified
to the point where it can make specific
predictions about the role of pragmatic information
in inference. Further research on these
computational models will have to address
these shortcomings.

ABSTRACT

Gentner & Holyoak (1997) pointed out that
there  has  been  convergence  between  theories 
of  analogy.  However,  the  role  of  pragmatics  in 
analogy  appears  to  still  divide  theories.  The  ef¬ 
fect  of  pragmatics  on  the  speed  of  analogical 
problem  solving  was  investigated  using  highly 
simplified  chess  problems.  The  pragmatic  fac¬ 
tor  of  goals  was  manipulated  by  instructing  par¬ 
ticipants  to  make  an  attacking  or  defensive 
move.  Participants  received  training  problems, 
followed  by  a  set  of  testing  problems  which 
were  solvable  by  analogical  transfer  from  a 
training  problem.  It  was  found  that  presenting 
the  same  goal  at  test  as  was  given  in  training 
for  a  maneuver  led  to  faster  solutions,  but  the 
effect  of  piece  similarity  (which  determined 
structural  similarity)  interacted  with  goal  simi¬ 
larity.  Piece  similarity  helped  when  the  prag¬ 
matics  were  consistent,  but  when  the  pragmat¬ 
ics  were  inconsistent,  other  forms  of  similarity 
had  no  effect.  This  supports  theories  in  which 
pragmatics  acts  as  a  strong  filter  for  analogies, 
rather  than  an  attenuated  filter. 

PRAGMATICS  AND  ANALOGICAL 
PROBLEM  SOLVING 

Gentner and Holyoak (1997) pointed out that
a consensus as to the nature of analogical reasoning
has emerged. However, in the companion pieces
introduced by Gentner and Holyoak (i.e., Gentner
& Markman, 1997; Holyoak & Thagard,
1997), a stark difference is apparent: pragmatics
is emphasized by Holyoak and Thagard, but ignored
by Gentner and Markman. The role of pragmatics
appears to remain a point of dispute between
theories of analogical reasoning.

According to Holyoak and Thagard (1989),
the pragmatics of an analogy are the goals and
purpose of the analogist. The context may provide
such pragmatics, or they may be brought by
the analogist to the situation; either way, they will
influence what analogies may be formed. It is
not disputed that pragmatics are important for
analogical reasoning, but how they exert their
influence is. Pragmatics
were implemented in Holyoak and Thagard’s
ACME computer model as providing emphasis
for important mappings or elements of an analog.
In contrast to pragmatics affecting the process
of analogical mapping, Gentner (1989) argued
that pragmatics could have an influence
before processing, by changing the representation
of the analogs; alternatively, pragmatics
could have an influence after processing, by causing
the rejection of analogies; but pragmatics
have no independent effect during processing.
In the implementation of Gentner’s ideas in SME
(see Falkenhainer, Forbus, & Gentner, 1989),
analogical processes are driven by structural and
semantic factors alone.

Spellman and Holyoak (1996) found evidence
that pragmatics influenced the process
of analogical mapping by showing that process
goals (i.e., the goals of the reasoner rather than
those contained within the analogs) influenced
the mappings people made. In particular, pragmatics
did not filter out all goal-irrelevant information,
as it would if pragmatics selected
the relevant parts of the source and target as
input to the mapping process. Rather than being
a strong filter, as attention was in Broadbent’s
(1958) selective attention model, pragmatics
instead was an attenuated filter, as in
Treisman’s (1964) alternative to Broadbent’s
model. Such an attenuated filter does not completely
block out the information it filters.

Therefore, Spellman and Holyoak (1996)
derived two testable hypotheses, the filter hypothesis
and the filter-attenuation hypothesis.
Fundamentally, Spellman and Holyoak’s argument
appears to make predictions about how
pragmatics would interact with other factors
such as semantic similarity and structural consistency.
If pragmatics are a strong filter, then
other factors should have no influence on analogical
success when the pragmatics are wrong.
If pragmatics are an attenuated filter, then other
factors should influence success even when
the goal is wrong. Therefore, the filter hypothesis
(here referred to as the strong filter hypothesis)
could be contrasted with the attenuated
filter hypothesis by crossing pragmatics with
other factors experimentally.
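
The two hypotheses can be contrasted with a toy model. This is a minimal sketch under strong simplifying assumptions, not a model proposed by Spellman and Holyoak: `predicted_benefit`, the `attenuation` parameter, and the similarity values are all hypothetical. A strong filter corresponds to an attenuation of zero, and an attenuated filter to a value between zero and one.

```python
def predicted_benefit(goal_match: bool, similarity: float, attenuation: float) -> float:
    """Toy benefit of a prior analog: similarity passed through a pragmatic gate.

    The gate is fully open when the goal matches; when the goal mismatches it is
    closed to `attenuation` (0.0 = strong filter, between 0 and 1 = attenuated filter).
    """
    gate = 1.0 if goal_match else attenuation
    return gate * similarity


def similarity_effect(goal_match: bool, attenuation: float,
                      low: float = 0.2, high: float = 0.8) -> float:
    """Difference in predicted benefit between high and low similarity."""
    return (predicted_benefit(goal_match, high, attenuation)
            - predicted_benefit(goal_match, low, attenuation))


for label, attenuation in [("strong filter", 0.0), ("attenuated filter", 0.5)]:
    with_goal = similarity_effect(goal_match=True, attenuation=attenuation)
    without_goal = similarity_effect(goal_match=False, attenuation=attenuation)
    print(f"{label}: similarity effect with matching goal = {with_goal:.2f}, "
          f"with mismatching goal = {without_goal:.2f}")
```

In this sketch both hypotheses predict the same similarity effect when the goal matches. They diverge only when the goal mismatches: the strong filter predicts no effect of similarity, whereas the attenuated filter predicts a reduced but nonzero effect, which is why crossing pragmatics with another factor can separate them.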

If  the  argument  over  the  role  of  pragmatics 
is  over  its  effect  on  processes,  then  response 
times  may  be  a  particularly  appropriate  depen¬ 
dent  measure.  Many  investigations  of  cognitive 
processes  have  used  response  times  (see  Pos¬ 
ner,  1986),  yet  response  time  has  rarely  been 
used  to  investigate  analogical  processes.  Klein 
(1986)  argued  that  speed  should  be  an  advan¬ 
tage  of  analogical  thinking,  but  did  not  directly 
test this idea. Thus, using response time as a
dependent measure allowed the validity of
speed as a measure of analogical problem solving
to be examined, and opened up the possibility
of gaining insight into analogical problem
solving as a process.

CRITERIA FOR INVESTIGATING
PRAGMATICS

It  is  inherently  difficult  to  explore  what 
happens  during  a  cognitive  process,  and  explor¬ 
ing  the  role  of  pragmatics  in  analogical  process¬ 
es  raises  a  unique  set  of  problems.  Spellman 
and  Holyoak  (1996)  proposed  three  criteria  that 
need  to  be  fulfilled. 

1. The pragmatic constraints must not be
reducible to other general constraints. If goals
simply form parts of the structure of an analog,
then their influence could be explained as a
special case of the influence of structural and/or
semantic constraints. In that case pragmatics
would be just like any other shared representational
component. To clarify this issue,
Spellman and Holyoak (1996) distinguished
between static and processing goals. For example,
in mapping the 1991 Persian Gulf War
to World War II, Hitler’s goal of taking over
Europe could be mapped to Saddam Hussein’s
assumed goal of taking over the Persian Gulf.
This would be a mapping of static goals internal
to the analogs. Spellman and Holyoak argue
that the Bush administration promoted the
mapping between the Persian Gulf crisis and
World War II in order to achieve an external
goal: military intervention by the United States
in the Persian Gulf.

However, the distinction between a processing
goal and a static goal is problematic. One
problem, acknowledged by Spellman and Holyoak
(1996), is that it is difficult to rule out
that a processing goal is immediately converted
into a static goal once it is given. A further
problem is that it may not be clear which goals
are internal or external to an analog. For
example, it could be argued that Bush’s military
intervention goal in 1991 was a static goal
internal to the analogs. The World War II analog
could already have a military intervention goal
embedded within it, as a result of the perceived
failure of appeasement before World War II.
Thus it could be argued that the pre-World War
II analog was retrieved by the Bush administration
because people represented it with a static
goal that mapped to Bush’s own static goal
for the Gulf War crisis, that the United States
should intervene.

2. The pragmatic effects should not be
attributable to post-mapping processes. When
analogies are used to solve problems, as they
have been in many studies of analogy, then
processes after the mapping process are required.
Mappings which violate the goals will be rejected.
To avoid this problem, Spellman and Holyoak
(1996) focus on the actual mappings people
make rather than what they do with that mapping.
However, even mapping tasks are vulnerable
to post-mapping processes, because not all
mappings can be recorded simultaneously.

3. The pragmatic effects should not be
attributable to pre-mapping processes. Pragmatics
may change the representation of analogs
before  the  mapping  process  begins.  Establish¬ 
ing  that  mapping  depends  on  goals  given  after 
initial  representation  was  the  major  purpose  of 
the  experiments  by  Spellman  and  Holyoak 
(1996).  By  finding  empirical  support  for  the 
filter-attenuation  hypothesis,  they  found  sup¬ 
port  for  the  claim  that  pragmatics  affects  the 
process  of  analogical  mapping. 

Meeting  these  criteria.  In  this  experiment 
an  analogical  problem  solving  task  was  used  to 
try  and  meet  the  above  criteria.  This  was  partly 
because  speed  was  used  as  the  measure  of  ana¬ 
logical  reasoning  rather  than  success,  in  which 
case  a  problem  solving  task  was  more  appro¬ 
priate  than  a  mapping  task  (it  is  more  likely  to 
have  a  clear  and  definite  end).  Problem  solving 
also  has  an  advantage  in  that  it  has  a  very  clear 
processing  goal:  solve  the  problem  by  achiev¬ 
ing  the  specified  goal.  This  processing  goal  has 
a  definite  and  consistent  focus  on  mapping  the 
goals  of  the  source  analog.  Therefore,  rather 
than manipulating processing goals, an alternative
way to investigate the effects of processing
goals is to maintain a consistent processing
goal, but manipulate the nature of the
arguably  static  goals  that  it  focuses  on.  This 
manipulation  would  allow  the  same  hypothe¬ 
ses  to  be  tested  as  when  the  processing  goals 
are  manipulated. 

To minimize the influence of post-mapping
processes, problem solving tasks for which
it is obvious whether the solution is correct could
be particularly appropriate. Clear solutions that
require no modification are less likely to
invoke post-mapping processes.

THE  TASK 

Determining the effects of pragmatics on
analogical problem solving required a
problem solving task with goals that were
easy to manipulate without affecting other
factors. Such a task is chess, which contains a
clear distinction between the goals of attack
and defend. Just as importantly, in chess the
exact same configuration of chess pieces can
be approached by a player as a position in
which an attack should be launched (i.e., an
attempt should be made to capture opposing
pieces or to gain a more favorable position),
or as one to be defended (i.e., your own pieces
should be protected, or your position
should not be allowed to deteriorate). Thus
chess has clear pragmatics that can be
manipulated independent of the structure of a
position (i.e., the relationships between pieces)
and its semantic components (i.e., the
actual pieces themselves). So the problems
consisted of highly simplified chess positions.
For each problem, participants were
presented with a chess board on which were
placed two defender chess pieces. One attacker
piece was presented off the board, waiting
to be placed on the board (see Figures 1a and
1c for examples of exactly what participants
saw).

When the goal was attack, participants
solved the problem by placing the attacker
piece onto the board so as to guarantee that
the attacker piece would be able to capture
one of the defender pieces on its next move
(after one of the defender pieces had had the
opportunity to make one move, just like in
normal chess). Example solutions for the
problems in Figures 1a and 1c are shown in
Figures 1b and 1d respectively. Identical
positions were given when the goal was defend,
but the problem task was the opposite: the
participant had to avoid the capture of a
defender. The participant legally moved one of
the two defender pieces in anticipation of the
attacker piece being placed onto the board
(e.g., Figures 2a and 2c). This attacker piece
would be anticipated to make an attacking
maneuver, exactly like the one that the
participants would have made if they had the
attack goal. For example, Figures 2b and 2d
show solutions to the problems shown in
Figures 2a and 2c respectively. The defend goal
thus incorporated the attack goal.

Figure 1. Examples of attack goal problems: (a) is a
rook fork and (c) a bishop pin, each with the attacker
piece (white) shown above the board and two defender
pieces (black) on the board. Solutions to these problems
are indicated in (b) and (d) respectively, which show
successful placement of the attacker.

Figure 2. Examples of defend goal problems: (a) is a
rook pin and (c) a bishop fork, each with the attacker
(black) shown above the board and two defenders
(white) on the board. In (b) and (d) are solutions to
these problems, each a successful move of a defender
(an open circle and line show where the piece moved
from). Also shown is where the attacker was threatening
to go (indicated by the solid circle and dashed line).


To help participants solve these problems,
they were trained on two simple chess
tactics known as pins and forks. Figure 1b
illustrates a fork solution: an attacker piece
is placed so as to simultaneously attack two
defender pieces. Because only one defender
piece can be moved at a time, only one
defender will be able to escape the attack,
leaving the other to be captured. Figure 1d
illustrates a pin solution: the attacker was placed
such that only one defender was directly
threatened, but if this first defender moved
away then the defender behind it could be
captured. Hence, a capture was guaranteed.
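The geometry of these two tactics can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions that are not part of the original study: squares are (file, rank) pairs numbered 0 to 7, there are exactly two defenders, and defender replies such as capturing the attacker are ignored. The function names are hypothetical.

```python
from itertools import product

# Move directions for the two attacker pieces used in the problems.
DIRECTIONS = {
    "rook": [(1, 0), (-1, 0), (0, 1), (0, -1)],
    "bishop": [(1, 1), (1, -1), (-1, 1), (-1, -1)],
}


def defenders_on_rays(piece, square, defenders):
    """For each line of movement from `square`, list the defenders on it,
    nearest first."""
    hits_per_ray = []
    for dx, dy in DIRECTIONS[piece]:
        x, y = square[0] + dx, square[1] + dy
        hits = []
        while 0 <= x < 8 and 0 <= y < 8:
            if (x, y) in defenders:
                hits.append((x, y))
            x, y = x + dx, y + dy
        hits_per_ray.append(hits)
    return hits_per_ray


def classify_placement(piece, square, defenders):
    """Return 'pin', 'fork', or None for placing the attacker on `square`."""
    if square in defenders:
        return None  # the attacker must be placed on an empty square
    hits_per_ray = defenders_on_rays(piece, square, defenders)
    # Pin: both defenders lie on one line, so the front one shields the other.
    if any(len(hits) == 2 for hits in hits_per_ray):
        return "pin"
    # Fork: two different lines each reach one defender.
    if sum(len(hits) == 1 for hits in hits_per_ray) == 2:
        return "fork"
    return None


def attack_solutions(piece, defenders):
    """Map every square that yields a pin or fork to the tactic it yields."""
    found = {}
    for square in product(range(8), repeat=2):
        kind = classify_placement(piece, square, defenders)
        if kind is not None:
            found[square] = kind
    return found
```

For instance, with defenders on (2, 5) and (5, 2), `attack_solutions("rook", {(2, 5), (5, 2)})` finds the two corner squares of the rectangle they span as fork placements. Under the same simplification, a defend problem would amount to moving one defender so that `attack_solutions` comes back empty.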

When the goal was defend, the participant
had to anticipate that the opponent was about
to place the attacker piece onto a square from
which it could execute a pin or a fork. The
participant had to move one of the defender pieces
so that no matter where the attacker piece was
placed, it could not execute a pin or fork. An
example of a problem requiring defense against
a rook pin is illustrated in Figure 2a, and a
successful solution is shown in Figure 2b (which
also illustrates the threat). Figure 2c illustrates
a defend goal when a fork by a bishop was
threatened, and Figure 2d is a solution.

Defense and attack are closely related in
these problems, as identical configurations of
pieces could be used for both goals. Knowing
how to successfully attack should help with
achieving the defend goal, as the specific
attack solution was what had to be defended
against. Similarly, knowing how to defend a
position implies knowing how to attack that
same position.

General methodology

Participants  received  training  on  achieving 
both  attack  and  defend  goals,  and  training  on  how 
to  execute  pins  and  forks.  They  were  then  tested 
by  being  presented  with  problems  that  required 
a  pin  or  fork  solution,  but  varied  in  whether  they 
had  the  same  goal  or  used  the  same  attacker  piece 
as  did  the  participant’s  training  for  pins. 

There were eight basic training positions
possible, each a combination of the three
two-level factors of bishop/rook attacker, pin/fork
solution, and attack/defend goal. These positions
will be referred to by combinations of their
initials: a bishop-pin-attack problem will be
referred to as BPA, a bishop-pin-defense as BPD,
a bishop-fork-attack as BFA, a bishop-fork-defense
as BFD, a rook-pin-attack as RPA, a
rook-pin-defense as RPD, a rook-fork-attack as RFA,
and a rook-fork-defense as RFD. Participants
were trained on one of the four pin problems.

The two independent variables of interest
were goal change (i.e., pragmatics) and piece
change (i.e., structure). Relative to a player’s own
training on a certain solution type (i.e., pin or
fork), all test positions could be considered
either the presence or absence of change along
either or both of these dimensions. A goal change
was defined as a problem with the opposite goal
to the pin training problem. A piece change was
defined as changing the attacker piece (e.g., from a
rook to a bishop). Changing the attacker required
changing the relationship between the defender
pieces and required thinking about the problem
in a different way because of the different ways
that rooks and bishops move; thus it changed the
structure of the problem.

Crossing the two variables yielded four
different types of problems: no-change, problems
that used the same goal and attacker piece as a
participant’s pin training problem;
change-piece, problems with the same goal, but
different attacker piece; change-goal, same attacker
piece, different goal; and change-both,
problems with a different goal and attacker piece
from the training problem. For example, a BPA
test problem was a change-goal problem if the
participant received BPD training, but a
change-piece problem if the participant received RPA
training. Thus, because different participants
received different training problems, every
specific test problem was classified into one of
these four types, depending on a participant’s
specific pin training. The design was therefore
completely within-subject, and the effects of
differences in the difficulty of specific problems were
eliminated by having equal numbers of
participants experience each type of training.
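The classification scheme described above can be sketched in a few lines. This is only an illustration of the coding logic, not the program used in the experiment; the three-letter codes follow the initials defined in the text, while the function and variable names are hypothetical.

```python
from itertools import product

# Three two-level factors, keyed by the initials used in the text.
PIECES = {"B": "bishop", "R": "rook"}
SOLUTIONS = {"P": "pin", "F": "fork"}
GOALS = {"A": "attack", "D": "defense"}

# The eight basic positions: BPA, BPD, BFA, BFD, RPA, RPD, RFA, RFD.
ALL_POSITIONS = ["".join(code) for code in product(PIECES, SOLUTIONS, GOALS)]

# Participants were trained on one of the four pin positions.
PIN_TRAINING = [code for code in ALL_POSITIONS if code[1] == "P"]


def change_type(training, test):
    """Classify a test position relative to a participant's pin training.

    Both arguments are three-letter codes such as 'BPA'. Only the piece
    (first letter) and goal (third letter) enter the classification.
    """
    piece_changed = training[0] != test[0]
    goal_changed = training[2] != test[2]
    if piece_changed and goal_changed:
        return "change-both"
    if piece_changed:
        return "change-piece"
    if goal_changed:
        return "change-goal"
    return "no-change"


# The same test problem falls into every category across training groups.
for training in PIN_TRAINING:
    print(training, "->", change_type(training, "BPA"))
```

Because each test problem lands in a different category for each of the four training groups, problem-specific difficulty is balanced across the four change types when the groups are of equal size.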

In  order  to  increase  the  number  of  testing 
problems  and  to  examine  the  effects  of  surface 
changes  to  the  problems,  a  third  type  of  change 
was  applied  to  problems  independent  of  the 
piece  and  goal  changes:  Surface  transforms, 
which  were  transformations  of  the  training 
problems  involving  changing  the  placement  or 
nature  of  defender  pieces. 

Predictions 

Both  the  strong  filter  and  attenuated-filter 
hypotheses  would  predict  that  a  pin  problem  with 
the  same  goal  as  the  pin  training  problem  should 
be solved faster than when the problem had a
different goal. Structural changes from the pin
training problem should also be responded to more slowly.
However,  how  surface  and  structural  changes  in¬ 
teract  with  goal  changes  was  the  critical  question 
with  regard  to  testing  the  predictions  of  the  strong- 
and  attenuated-filter  hypotheses. 

The  strong  filter  hypothesis  would  appear 
to  predict  that  the  goal  and  structure  changes 
should  interact,  such  that  when  a  problem  had 
the  same  goal  as  the  training  problem  requiring 
the  same  solution,  then  having  similar  struc¬ 
ture  should  further  speed  responding.  In  con¬ 
trast,  when  the  goal  is  different,  the  strong  fil¬ 
ter  should  render  other  factors  irrelevant.  Thus 
problem  solvers  with  the  wrong  goal  would  not 
be  helped  by  similar  structure  or  surface  fea¬ 
tures,  because  they  would  be  searching  the 
wrong part of memory (see Schank, 1982) or
because pragmatics had an effect outside the
process of analogical mapping (see Gentner,
1989). Therefore, when the problem has the
wrong goal, neither structure nor surface
similarity should affect the speed of solutions.

The filter-attenuation hypothesis, suggested
by Holyoak and Thagard’s (1989) multiconstraint
theory, could be consistent with either
the  presence  or  absence  of  an  interaction  be¬ 
tween  goal  and  structure  manipulation.  The  crit¬ 
ical  prediction  of  this  hypothesis  was  that  oth¬ 
er  factors  should  continue  to  have  an  effect  even 
when  the  goals  were  wrong.  If  pragmatics  are 
part  of  the  process  of  analogical  mapping,  then 
it  would  be  expected  that  structural  features 
would  continue  to  influence  problem  solving 
even  when  the  goal  was  wrong.  Therefore,  the 
strong  filter  and  filter-attenuation  hypotheses 
make  contrasting  predictions  for  the  effects  of 
structure,  when  the  goals  of  the  source  and  tar¬ 
get  analogs  do  not  match. 

AN  EMPIRICAL  TEST 
Method 

Participants. A total of 108 participants
(87 male and 21 female), with 27 in each pin
training group, were drawn from the introductory
psychology participant pool at the University
of California, Los Angeles.

Apparatus.  An  Apollo  series  4000  work¬ 
station  with  a  19  in.  color  monitor  and  a  three- 
button  mouse  was  utilized.  A  program  devel¬ 
oped  by  the  author  controlled  the  experiment 
and  response  times  were  measured  with  an  ac¬ 
curacy  of  one  second. 

Materials. The pin training problems were
similar to those shown in Figures 1c and 2a,
except that both attacking and defending versions
of either could be given. This yielded four
training problems. An additional change was that the
non-king defender was always a knight, rather
than varying with the attacker piece. The same
type of fork training problem was given to all
participants. This problem used a knight as the
attacker and the defenders were a king and a
queen. Thus, the fork training problems had
minimum similarity with any of the pin problems.

Forty-two testing problems were given, all
of which were solvable with a pin or a fork. Most
of these were based on the four pin training
problems specified, as well as on the four possible
fork training problems not given in this
experiment. However, one of four transforms was
applied: the identity transform was identical to
a training problem; the rotate transform rotated
a training problem by 90°; the defender
transform changed the non-king defender into the
opposite type of piece to the attacker piece (i.e.,
if a rook was the attacker, then the defender was
a bishop, and vice versa); the reconfigure
transform increased the distance between the
defenders and changed the non-king defender into the
same type of piece as the attacker piece. In
addition to these problems, a set of problems to which
either a fork or pin solution could be applied was
given. These ambiguous problems will not be
discussed here. A single randomly generated
order for the 42 problems was created, with
attack and defend problems alternating.

Procedure.  Participants  were  given  prac¬ 
tice  trials  that  tested  their  knowledge  of  the 
moves  of  chess,  and  which  gave  them  practice 
with  recognizing  the  pieces,  and  with  moving 
them  around  with  the  mouse.  To  teach  them 
about  attack  and  defend  goals,  they  were  given 
knight-fork-attack problems, and then
knight-fork-defense problems. For each of these sets
of  problems,  they  had  to  correctly  solve  three 
consecutive  problems,  before  they  went  on  to 
the  next  stage.  Each  training  problem  in  a  set 
was  the  same  except  that  the  pieces  were  shift¬ 
ed  to  different  places  on  the  chess  board.  The 
computer  showed  participants  a  correct  solu¬ 
tion  if  they  were  incorrect. 

The  link  between  attack  and  defense  was 
made very explicit in the participants’
instructions. For defend problems, they were advised
to  first  think  of  where  they  themselves  would 
place  the  attacker,  if  they  had  the  chance.  Then 
they  should  move  a  defender  to  render  that 
placement  harmless. 

Two  more  training  sets  were  then  given, 
the  nature  of  which  depended  on  the  training 
condition  of  a  participant.  Participants  were 
given  the  type  of  training  problem  specified  by 
their condition, plus one other set. If they
received BPA or RPA training, then they received
an extra set of knight-fork-defense problems.
However, those given BPD or RPD training
received an extra set of knight-fork-attack
problems. This equalized the amount of training on
attack  and  defend  goals.  After  the  training  was 
successfully  completed,  participants  were  giv¬ 
en  the  42  test  problems. 

Results 

The mean number of errors made by
participants was 5.5 (SD = 3.14) out of 42.
Participants had very high accuracy for attack
problems (96% correct), and also high accuracy on
defend problems (78% correct).

Response times were skewed, so a log
transform was applied to them. Given the high
solution rate, the critical dependent measure
was log response times for pin problems,
ignoring whether a problem was correctly solved.
There was no evidence of a speed-accuracy
trade-off, as number of errors correlated
positively with response time, though not
significantly, r(108) = .11, p = .25. Response times
unclassified by change type showed a
significant linear trend, F(1,104) = 157.30, p < .001
(MSE = .47), indicating that participants became
faster as they completed more problems. There
was no effect of training condition on response
time, F(3,104) = .50 (MSE = 2.86), indicating
equivalence of the overall effects of training.

Pin  problems  were  then  classified  by 
change  type  (no-change,  change-piece,  change- 
goal,  change-both)  and  the  mean  response  times 
for  each  of  the  four  change  types  across  the  four 
transformations are presented in Table 1.

A 4x2x2x2x4 mixed ANOVA was carried
out with between-subject factors of training
type (four levels) and ambiguous set (two
levels), and within-subject factors of goal (same
or different), piece (same or different), and
transform (four levels). There were main effects
of goal, F(1,100) = 18.38, p < .001 (MSE = .35),
and piece, F(1,100) = 5.61, p = .020 (MSE =
.30), and an almost significant interaction
between goal and piece, F(1,100) = 2.99, p = .087
(MSE = .30). There was also a main effect of
transform, F(3,300) = 25.24, p < .001 (MSE =
.23), but there were no significant interactions
with transform: transform by goal, F(3,300) =
1.29, p = .28 (MSE = .25); transform by piece,
F(3,300) = .76 (MSE = .22); transform by piece
by goal, F(3,300) = 1.19, p = .31 (MSE = .26).

Table 1. Mean log response times (SD in parentheses)
for each type of problem.

no-change    change-piece    change-goal    change-both
2.66 (.41)   2.77 (.54)      2.83 (.45)     2.85 (.45)

The differences between critical groups
were tested to determine which of the
predicted differences were present. The no-change
problems were solved faster than the
change-piece problems, F(1,100) = 10.04, p = .002
(MSE = .24). However, change-goal problems
did not differ from change-both problems,
F(1,100) = .17 (MSE = .37). Therefore, when
the goal did not change, there was a clear effect
of piece, but there was no piece effect when
the goal was changed. The difference between
the change-piece and change-both sets of
problems was almost significant, F(1,100) = 3.75,
p = .056 (MSE = .34), suggesting that changing
the goal had an effect in addition to changing
the attacker.
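The pattern behind these contrasts can be read directly from the cell means reported in Table 1. The following sketch simply differences those four means; it does not reproduce the F tests, which require the raw data, and the function names are illustrative only.

```python
# Mean log response times for pin problems, as reported in Table 1,
# keyed by (goal_changed, piece_changed).
MEAN_LOG_RT = {
    (False, False): 2.66,  # no-change
    (False, True): 2.77,   # change-piece
    (True, False): 2.83,   # change-goal
    (True, True): 2.85,    # change-both
}


def piece_effect(goal_changed):
    """Slowing (in log units) due to a piece change at one level of goal."""
    return round(
        MEAN_LOG_RT[(goal_changed, True)] - MEAN_LOG_RT[(goal_changed, False)], 2
    )


def goal_effect(piece_changed):
    """Slowing (in log units) due to a goal change at one level of piece."""
    return round(
        MEAN_LOG_RT[(True, piece_changed)] - MEAN_LOG_RT[(False, piece_changed)], 2
    )


print("piece effect, same goal:     ", piece_effect(goal_changed=False))
print("piece effect, different goal:", piece_effect(goal_changed=True))
print("goal effect, same piece:     ", goal_effect(piece_changed=False))
print("goal effect, different piece:", goal_effect(piece_changed=True))
```

On these means the piece change costs 0.11 log units when the goal is unchanged but only 0.02 when the goal is changed, which is the asymmetry the contrasts test.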

Control comparison. Responses to
problems requiring a fork solution allowed a control
comparison. If transfer occurred from pin
training to fork problems, then fork problems should
have been affected by which pin training
problem was given. To test this, fork problems were
classified as no-change, change-goal,
change-piece or change-both, as though they were pin
problems. The mean log response times across
transforms were: for no-change, M = 2.86 (SD =
.41); for change-piece, M = 2.90 (SD = .49); for
change-goal, M = 2.93 (SD = .45); and for
change-both, M = 2.93 (SD = .49). There was no
effect of piece, F(1,100) = .70 (MSE = .27), or
goal, F(1,100) = 1.90, p = .17 (MSE = .41), nor
an interaction between these factors, F(1,100) =
.54 (MSE = .25). Surface transform did not
interact with anything (all F’s < 1.0). Therefore
there was no evidence of transfer from pin
training problems to fork problems.

Discussion 

The  experiment  found  clear  effects  of  goal 
and piece changes. However, if the goal was
the  same  as  that  used  in  the  pin  training  prob¬ 
lem,  then  responding  was  faster  when  the  same 
attacker  piece  was  used  than  when  a  different 
attacker  piece  was  used  in  the  problem.  Add¬ 
ing  a  goal  change  to  a  piece  change  slowed  re¬ 
sponding,  but  when  the  goal  was  changed,  there 
was  no  difference  between  piece  change  con¬ 
ditions.  Therefore  the  results  supported  the 
strong  filter  hypothesis  rather  than  the  filter- 
attenuation  hypothesis,  as  structure  was  only 
relevant  when  the  pragmatics  matched. 

If  pragmatics  have  their  effects  outside  of 
the  mapping  process,  then  the  results  seem  more 
consistent  with  explaining  the  pragmatic  effects 
as  due  to  pre-mapping  processes  rather  than  a 
post-mapping process. If pragmatics had an
effect after mapping, then an independent main
effect of piece would be expected, given that
the  piece  similarity  would  have  an  effect  be¬ 
fore  the  goal  would. 

The  results  do  not  necessarily  disconfirm 
the  claim  that  goals  can  affect  the  process  of 
analogical  mapping.  As  Spellman  and  Holyoak 
(1996) pointed out, the strong filter of Gentner
(1989) is a special case of the continuum
represented by an attenuated filter. Perhaps when
the goal is of high importance, the filter may
be strong and allow little other information
through. Such an argument raises the question
of what determines how strong the filter is;
otherwise the attenuated-filter hypothesis becomes
undisconfirmable.

Processing unsuccessful analogies. The
core of the argument over pragmatics concerns
how analogies are processed, and the data
provide another form of evidence that may address
this issue. For many cognitive processes, a key
form of evidence has been what happens when
they fail. It is known that good analogies can
lead to poor inferences when the analogy is
inappropriate, but what is the process when an
otherwise good analogy fails to lead to any
applicable solution? None of the models of
analogy explicitly address this issue; nonetheless
some intuitions could be derived about what
might happen. It would seem that within a
serial model, such as Falkenhainer et al.’s (1989)
SME, a goal that leads to an analogy that is
inapplicable should result in a slower solution
than when the goal does not lead to an
inappropriate analogy. Such a goal would be more
likely to lead the problem solver down the wrong
path, and further down this wrong path, and thus
add to the total time to find an appropriate
solution (analogically or otherwise). Thus, it
would appear that response time should depend
on how ‘good’ the target problem was as an
analog to the inapplicable source problem. In
contrast, in a parallel model, such as Holyoak
and Thagard’s (1989) ACME, in which the
pragmatics and the solution are all part of the
mapping process, failure would be
indicated by failure to converge. When ACME
converges, it could be assumed that it will be
faster the better the analogy is. However, when it
fails to converge it should take the same amount
of time to recognize that convergence is not
occurring, no matter how good the analogy
otherwise appears to be (though this is based on
hypothesizing a mechanism in ACME for
recognizing failure to converge, something it does
not have). Therefore, how good a target is to an
inapplicable source analogy should not affect
solution time, assuming that it is clear whether
a solution can be applied or not.

The fork problem response times provided
some empirical data about inappropriate
analogies. Participants did not know their pin
training would not be applicable to these fork
problems until they tried to map the pin solution to
the new problem. If goals are important, as the
pin problem data suggest they are, then
similarity of goals should have increased the chance
that the pin training problem would have been
retrieved when it had the same goal as a
problem requiring a fork solution. The more
analogous a fork problem was to the pin training
problem, the slower should have been the
participants’ responding. Yet there were no
differences between response times for different
types of fork problems, no matter how similar
they were to the pin training problem. Such a
finding appears to be consistent with goals
being a part of the process, rather than a separate
stage. However, this requires more
examination.

ABSTRACT

Three  experiments  used  an  artificial  sign 
language  to  investigate  whether  the  mapping 
of  verbal  statements  to  spatial  schemas  is  con¬ 
strained  by  similarity  of  relational  structures. 
In  Experiment  1  adults  were  shown  diagrams 
of  hand  gestures  paired  with  locative  state¬ 
ments,  and  asked  to  judge  the  meaning  of  new 
gestures.  In  Experiment  2,  adults  were  asked  to 
make similar judgments with active declarative
statements.  In  Experiment  3,  the  artificial  signs 
were  paired  with  conjunctive  and  disjunctive 
relations.  Results  of  all  three  experiments  indi¬ 
cate  that  adults  choose  a  physical  object  to  rep¬ 
resent  a  conceptual  element  and  a  physical  re¬ 
lation  to  represent  a  conceptual  relation.  These 
results  corroborate  the  structure-driven  map¬ 
ping  patterns  found  in  previous  studies  of  visu¬ 
al  reasoning,  and  provide  further  support  that 
visual  reasoning  is  based  on  general  cognitive 
constraints  on  mapping  concepts  to  space. 

From  the  early  likening  of  sound  to  waves 
to  the  more  recent  comparison  of  armies  and 
rays,  many  analogies  intertwine  spatial  and 
conceptual  components  so  tightly  that  it  seems 
difficult  to  unravel  how  they  first  came  togeth¬ 
er.  Perhaps  this  melding  is  one  reason  why 
many  investigations  of  analogy  have  involved 
comparison  of  problems  which  may  be  present¬ 
ed  verbally  or  visually  without  asking  how  the 
two  forms  of  representation  are  related.  The 
question  of  how  spatial  and  conceptual  infor¬ 
mation  are  linked  is  important  not  only  for  un¬ 
derstanding  analogy,  but  also  for  understand¬ 
ing  how  spatial  structures  influence  the  use  of 
diagrams  and  models  in  reasoning  (Glasgow, 
Narayanan, & Chandrasekaran, 1995), the
structure of languages (Bloom, Peterson, Nadel, and
Garrett, 1996), and perhaps even the origins of
abstract cognitive abilities (Pinker, 1989).

MAPPING  CONCEPTS  TO  SPACE 

Research  on  reasoning  with  spatial  repre¬ 
sentations  suggests  two  possible  principles  gov¬ 
erning  the  mapping  of  conceptual  and  spatial 
schemas  (Gattis,  1997).  Consistent  mappings 
may  derive  from  meaningful  associations,  such 
as  the  association  between  "more"  and  "up,"  or 
from structure-driven mapping, matching
conceptual and spatial schemas based on
structural similarities.

Association-based  Mapping 

Associations  between  physical  aspects  of 
the  world  and  conceptual  aspects  of  experience 
are  frequently  reflected  in  language,  such  as  the 
association  between  "more"  and  "up"  reflected 
in  metaphorical  expressions  like  "My  income 
rose last year" (Lakoff & Johnson, 1980, pp. 15-16).
Such associations may influence how people
map conceptual schemas to spatial schemas.
Research on children's graphic constructions
indicates  that  when  asked  to  place  stickers  on  a 
piece  of  paper  to  represent  increases,  children 
representing  quantitative  increases  in  a  verti¬ 
cal  direction  are  more  likely  to  place  the  low¬ 
est  level  (i.e.  "a  small  amount")  at  the  bottom 
of  the  page  and  the  highest  level  (i.e.  "a  really 
big  amount")  at  the  top  of  the  page  (Gattis, 
1997;  Tversky,  Kugelmass,  &  Winter,  1991). 
Similarly,  adults  asked  to  map  relational  terms 
to  vertical  or  horizontal  lines  mapped  "above" 
and  "below,"  "better"  and  "worse,"  and  "more" 
and  "less"  most  often  to  a  vertical  axis,  with 
the  first  term  of  each  pair  at  the  top,  and  the 
latter term at the bottom (Handel, DeSoto, &
London,  1968). 

Structure-driven  Mapping 

Association-based  mappings  appear  to  be 
inadequate,  however,  for  explaining  the  diverse 
interactions  of  mapping  patterns  in  visual  rea¬ 
soning.  The  direction  and  strength  of  some 
mapping  patterns  are  not  easily  explained  by 
association-based mapping, such as the tendency
to map "steeper" to "faster," reported not
only  in  adults  but  also  in  young  children  with 
no  graphing  experience  (Gattis,  1997;  Gattis  & 
Holyoak,  1996).  Our  experience  in  the  physi¬ 
cal  world  is  as  likely  to  lead  to  an  association 
between  "steeper"  and  "slower"  as  between 
"steeper"  and  "faster,"  since  steeper  hills  lead 
to  slower  rates  of  travel  uphill  and  faster  rates 
of  travel  downhill. 

In  addition,  association-based  mappings 
may  come  into  conflict,  and  when  multiple 
mappings  conflict,  some  mappings  reliably  take 
precedence  over  others.  Gattis  and  Holyoak 
(1996)  asked  adults  to  reason  with  graphic  con¬ 
structions  which  contrasted  two  natural  map¬ 
pings:  the  iconic  mapping  of  "up"  on  a  vertical 
line  and  "up"  in  the  atmosphere  against  the 
metaphoric  mapping  between  steeper  slope  and 
faster  rate  of  change.  The  latter  mapping  exert¬ 
ed  a  stronger  influence  on  reasoning  perfor¬ 
mance.  Thus  a  coherent  system  appears  to  guide 
which  mapping  is  used,  even  if  some  mappings 
may  be  derived  from  prior  associations. 

A  second  explanation  for  mapping  consis¬ 
tencies  is  that  mappings  between  concepts  and 
space  are  based  on  general  constraints  govern¬ 
ing  the  mapping  process,  rather  than  or  in  addi¬ 
tion  to  specific  associations.  An  example  of 
such  a  general  constraint  is  the  tendency  ob¬ 
served  in  analogical  mapping  to  map  two  con¬ 
cepts  based  on  structural  similarities  (Gentner, 
1983).  Structure-driven  mapping  is  appealing 
because  it  can  explain  reported  mapping  pat¬ 
terns  for  reasoning  both  about  quantities  and 
about  rates.  When  Gattis  (1997)  asked  young 
children  to  reason  about  quantity  or  rate  using 
graph-like diagrams, children's judgment patterns
revealed two highly consistent mappings of concepts
to spatial dimensions: quantity was inferred from
the height of a line and rate was inferred from
the slope of a line. Mapping of concepts to space
thus appears to be governed
by  relational  structure.  Young  children  mapped 
quantity  to  height — structurally  similar  because 
they  are  both  relations  between  elements — and 
rate  to  slope — structurally  similar  because  they 
are  both  relations  between  relations. 

Mapping  Relational  Structure 

The  studies  reported  here  focus  on  very  sim¬ 
ple  relational  structures  —  elements  and  rela¬ 
tions  between  elements — to  further  explore  how 
relational  structure  is  defined  in  conceptual  and 
spatial  schemas.  Three  experiments  used  an  ar¬ 
tificial  sign  language  to  investigate  whether 
adults' conceptual interpretations of completely
novel spatial schemas would also be characterized
by structure-driven mapping. If visual
reasoning  is  indeed  based  on  mapping  relational 
structures  from  conceptual  to  spatial  schemas, 
judgment  patterns  ought  to  reflect  mapping  of 
conceptual  elements  to  physical  objects  and  con¬ 
ceptual  relations  to  physical  relations. 

Experiment  1: 

Relational Structure
In Locative Statements

Experiment  1  examined  whether  relation¬ 
al  structure  influences  mapping  of  locative 
statements  to  spatial  schemas  by  asking 
adults  to  interpret  an  artificial  sign  language 
in a three-phase procedure. The first phase
assigned a specific meaning to each hand, as
seen in Figure 1. The second phase paired two
signs  made  with  the  right  hand  with  two  sim¬ 
ple  locative  statements  involving  the  object 
represented  by  the  right  hand.  The  two  signs 
were  touching  the  right  ear  with  the  right  hand 
and  touching  the  left  ear  with  the  right  hand, 
as  seen  in  Figure  2.  These  two  signs  were  in¬ 
tentionally  ambiguous:  the  assignment  of  one 
locative  to  each  sign  leaves  open  whether  it 
is  the  object  touched  by  the  hand  (right  ear, 
left  ear)  or  the  relation  of  the  hand  to  body 
(ipsilateral, contralateral) that carries meaning.
The third phase introduced two complementary
signs made with the left hand (touching
the left ear with the left hand and touching
the right ear with the left hand, illustrated
in Figure 3), and asked participants to
judge  which  of  two  new  locative  statements 
was  represented  by  each  new  sign. 

The  four  locative  statements  used  in  Ex¬ 
periment  1  were  "Mother  is  in  the  car,"  "Moth¬ 
er  is  in  the  office,"  "Father  is  in  the  car,"  and 
"Father  is  in  the  office."  Two  types  of  rela¬ 
tional  structure  were  contrasted  by  varying 
which  aspect  of  the  statement  was  clearly 
mapped  and  which  aspect  of  the  statement  was 
ambiguously  mapped.  This  was  accomplished 
by  manipulating  which  aspect  of  the  locative 
statement  was  assigned  to  the  hands.  The 
meanings  assigned  to  the  right  and  left  hands 
were  either  "car"  and  "office,"  or  "mother"  and 
"father."  For  those  participants  for  whom  "car" 
and  "office"  were  assigned  to  the  hands,  the 
subjects  of  the  locative  statements  introduced 
in  the  second  phase  ("mother"  and  "father") 
were  unassigned  and  therefore  ambiguously 
mapped.  In  contrast,  for  those  participants  for 
whom  "mother"  and  "father"  were  assigned  to 
the hands, the locative predicates¹ ("car" and
"office")  were  unassigned  and  therefore  am¬ 
biguously  mapped. 

The  expectation  was  that  structure-driv¬ 
en  constraints  on  mapping  conceptual  to  spa¬ 
tial  schemas  would  lead  people  to  map  the 
unassigned  portion  of  the  statement  to  a  struc¬ 
turally  similar  aspect  of  the  accompanying 
sign.  Assigning  "car"  and  "office"  to  the 
hands  leads  to  ambiguously  mapped  subjects, 
"mother"  and  "father,"  and  participants  in  this 
condition  were  expected  to  map  those  sub¬ 
jects  to  physical  elements  of  the  sign  (the 
right  and  left  ears).  Assigning  "mother"  and 
"father"  to  the  hands  leads  to  ambiguously 
mapped  locative  predicates,  "car"  and  "of¬ 
fice,"  and  participants  in  this  condition  were 
expected to map predicates to a physical relation
in the sign (the ipsilateral and contralateral
relations of the arm to the rest of the
body).  These  two  mapping  patterns  were  then 
predicted  to  lead  to  opposite  judgment  pat¬ 
terns  in  the  final  phase. 


METHOD 

Participants.  One  hundred  and  thirty-eight 
students  from  the  University  of  Technology, 
Chemnitz  and  the  University  of  Munich  partici¬ 
pated  in  Experiment  1.  Experiment  materials 
were  distributed  and  completed  during  a  psychol¬ 
ogy  class,  and  participation  was  voluntary.  Ap¬ 
proximately  half  of  the  students  were  randomly 
assigned  to  each  of  the  two  conditions. 

Two experimental questions at the end served
as a consistency measure, and those participants
who did not answer the two questions consistently
were not included in the analyses (see Results
section for details). Three subjects from each of
the  two  conditions  did  not  answer  these  two  ques¬ 
tions  consistently  and  were  discarded  from  the 
analyses,  resulting  in  72  subjects  in  the  S  condi¬ 
tion  and  60  subjects  in  the  R  condition  (the  S  and 
R  conditions  are  explained  below). 

Procedure  and  Design.  Participants  were 
given  a  booklet  of  three  sheets  of  paper  stapled 
together.  Each  page  contained  two  illustrations, 
each  accompanied  by  a  simple  declarative  state¬ 
ment.  Each  illustration  and  the  accompanying 
statement  occupied  approximately  half  a  page, 
and  the  materials  were  organized  vertically  so 
that  the  first  illustration  occupied  the  top  half 
of  the  first  page,  the  second  illustration  occu¬ 
pied  the  bottom  half  of  the  first  page,  and  so 
on.  The  instructions  were,  "Please  read  the  fol¬ 
lowing  carefully.  At  the  end  you  will  be  asked 
questions  about  it." 

On the first page were two drawings, first a
drawing of a character extending his right hand,
and then the same character extending his left
hand (see Figure 1). Above each drawing was a
sentence, "This hand means ____." For half
of the participants, the blank was filled with
the words "mother" and "father," with the order
counterbalanced so that for half of those
participants the right hand meant "mother," and
for half the right hand meant "father." For the
other half of the participants, the blank was
filled with the words "car" and "office," with
the order of car and office counterbalanced in
the same way. Varying the assignment of meaning
to the hands was the primary experimental
manipulation, because it had the effect of varying
which portion of the locative statements
introduced in the second phase was unassigned.
When the hands meant "car" and "office," the
unassigned portion of the locative statements
was the subject ("Subject varies" or S condition).
When the hands meant "Mother" and
"Father," the unassigned portion of the locative
statements was the relational predicate
("Predicate varies" or R condition). All other
manipulated variables were counterbalancing
for assignment of meaning to left and right
hands and assignment of meaning to each sign.

1 For both conditions, the phrase "is in the" was
introduced in the second phase and therefore was part of the
unassigned meaning. The phrase gets parsed with its object
("is in the car" or "is in the office"), however, thus
constituting a predicate phrase.

On the second page were two new drawings,
showing the same character first touching
his right ear with his right hand, and then touching
his left ear with his right hand, or in the
opposite order. Above each drawing was a sentence.
For the S condition, the two sentences
were "This means 'Mother is in the car'" and
"This means 'Father is in the car,'" or "This
means 'Mother is in the office'" and "This
means 'Father is in the office,'" depending on
the counterbalancing of assignment of "car" and
"office" to right and left hands in the first phase.
For the R condition, the two sentences were
"This means 'Mother is in the car'" and "This
means 'Mother is in the office,'" or "This
means 'Father is in the car'" and "This means
'Father is in the office,'" depending on the
counterbalancing of assignment of "Mother"
and "Father" to right and left hands in the first
phase. Note that in the S condition, the subject
of the sentence varies, and in the R condition,
the object of the locative predicate varies, but
all participants received the same two signs. For
both conditions, the two signs paired with these
locative statements are ambiguously mapped:
it is not clear whether it is the object touched
by the hand (right ear and left ear) or the relation
of the hand to body (ipsilateral and contralateral)
that means "Mother" and "Father,"
or "car" and "office."

Figure 1. In the first phase, text accompanying these
drawings assigned a particular meaning to each hand.

On the third page were two new drawings
of the same character making the complementary
signs with his left hand, first touching his
left ear with his left hand, and then touching
his right ear with his left hand, or in the opposite
order. Above each drawing was the question,
"What does this mean?", and below each
drawing were two sentences, with the instruction,
"Circle the answer that fits best." For the
S condition, the two sentences read, "'Mother
is in the office' OR 'Father is in the office,'"
or "'Mother is in the car' OR 'Father is in the
car,'" depending on the counterbalancing of
assignment of "car" and "office" to right and
left hands in the first phase. For the R condition,
the two sentences read, "'Father is in the
car' OR 'Father is in the office,'" or "'Mother
is in the car' OR 'Mother is in the office,'"
depending on the counterbalancing of assignment
of "Mother" and "Father" to right and left hands
in the first phase. For both conditions the order
of these two sentences was also counterbalanced.
As in the second phase, in the S condition,
the subject of the sentence varies, and in
the R condition, the object of the locative
predicate varies, but all participants received the
same two illustrations of signs made with the
left hand (see Figure 3).

Figure 2. In the second phase, two signs made with the
right hand were accompanied by two locative statements
(Experiment 1), two active declarative statements
(Experiment 2), or two conjunctive or disjunctive
statements (Experiment 3).


Figure 3. In the third phase, two signs made with the
left hand, complementary to those previously shown
with the right hand, were used to probe
conceptual-spatial mappings.
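
The booklet structure described in the procedure can be summarized in a short sketch. This is only an illustration under stated assumptions: the names `locative_statements` and `booklet` are hypothetical and do not come from the study, and the counterbalancing of drawing order and sentence order is left out.

```python
SUBJECTS = ("Mother", "Father")
PLACES = ("car", "office")


def locative_statements(condition, hand_meaning):
    """Return the two statements that accompany signs made with one hand."""
    if condition == "S":
        # Hands mean places, so the subject is the unassigned, varying part.
        return [f"{subject} is in the {hand_meaning}" for subject in SUBJECTS]
    if condition == "R":
        # Hands mean people, so the locative predicate is the varying part.
        return [f"{hand_meaning} is in the {place}" for place in PLACES]
    raise ValueError(f"unknown condition: {condition!r}")


def booklet(condition, right_hand, left_hand):
    """Assemble the three phases for one (hypothetical) participant."""
    return {
        "phase1": {"right hand": right_hand, "left hand": left_hand},
        # Phase 2 pairs the right-hand signs with statements to be learned.
        "phase2": locative_statements(condition, right_hand),
        # Phase 3 offers the left-hand statements as a forced choice.
        "phase3": locative_statements(condition, left_hand),
    }


print(booklet("S", "car", "office"))
print(booklet("R", "Mother", "Father"))
```

In both conditions the signs are identical; only the part of the statement left unassigned after the first phase differs.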

RESULTS  AND  DISCUSSION 

Asking  participants  to  judge  the  meaning 
of  two  signs  with  the  left  hand  allowed  for  a 
consistency  measure,  in  that  the  two  signs  log¬ 
ically  ought  to  have  two  different  meanings. 
Participants  who  circled  the  same  locative  state¬ 
ment  for  both  signs  were  therefore  considered 
to  have  given  inconsistent  answers,  and  were 
discarded  from  the  analyses. 

The  answers  given  by  the  remaining  par¬ 
ticipants  were  then  coded  by  whether  the  unas¬ 
signed  meaning  was  mapped  to  the  ears  (ob¬ 
ject-based  or  O  mappings)  or  to  the  ipsilateral 
and  contralateral  bodily  relations  (relational  or 
R  mappings).  This  was  easily  derived  by  com¬ 
paring  the  circled  statements  for  each  sign  with 
the  statement-sign  pairs  on  the  previous  page. 
For  example,  a  participant  in  the  S  condition 
(subject  varies)  circled  "Father  is  in  the  office" 
as  the  meaning  of  the  left-hand-to-right-ear 
sign,  and  "Mother  is  in  the  office"  as  the  mean¬ 
ing  of  the  left-hand-to-left-ear  sign.  Since  it  was 
already known that for this person "car" and
"office" were assigned to right and left hand
respectively,  comparing  these  judgments  with 
the  statement-sign  pairings  on  the  second  page, 
where  the  right-hand-to-left-ear  sign  was  la¬ 
belled  "Mother  is  in  the  car"  and  the  right-hand- 
to-right-ear  sign  was  labelled  "Father  is  in  the 
car,"  indicated  that  this  person  mapped  the  vary¬ 
ing  subjects,  "mother"  and  "father"  to  the  left 
and  right  ears,  respectively.  This  answer  was 
coded  as  an  object-based  (O)  mapping.  Had  this 
same participant selected "Mother is in the office"
and "Father is in the office" for the left-hand-to-right-ear
and left-hand-to-left-ear signs
respectively,  that  would  have  been  coded  as  a 
relational  (R)  mapping. 

Compare  that  with  the  answers  given  by  a 
participant in the R condition (predicate varies).
This person circled "Father is in the car"
as  the  meaning  of  the  left-hand-to-right-ear  sign 
and  "Father  is  in  the  office"  as  the  meaning  of 
the  left-hand-to-left-ear  sign.  Since  it  was  al¬ 
ready  known  that  for  this  person  "mother"  and 
"father"  were  assigned  to  right  and  left  hand 
respectively,  comparing  these  judgments  with 
the  statement-sign  pairings  on  the  second  page, 
where  the  right-hand-to-left-ear  sign  was  la¬ 
belled  "Mother  is  in  the  car"  and  the  right-hand- 
to-right-ear  sign  was  labelled  "Mother  is  in  the 
office,"  indicated  that  this  person  mapped  the 
varying  predicates  "is  in  the  car"  and  "is  in  the 
office"  to  the  contralateral  and  ipsilateral  bodi¬ 
ly  relations,  respectively.  This  answer  was  cod¬ 
ed  as  a  relational  (R)  mapping.  Had  this  same 
participant selected "Father is in the office" and
"Father is in the car" for the left-hand-to-right-ear
and left-hand-to-left-ear signs respectively,
that would have been coded as an object-based
(O) mapping.
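
The coding rule illustrated by these two examples can be written down compactly. The sketch below is an assumption-laden illustration, not the author's scoring procedure: the function names and the dictionary representation (ear touched mapped to the unassigned term) are hypothetical.

```python
def relation(hand, ear):
    """Bodily relation of a sign: same side is ipsilateral, else contralateral."""
    return "ipsilateral" if hand == ear else "contralateral"


def code_response(phase2, phase3):
    """Code one participant's judgments as 'O', 'R', or 'inconsistent'.

    phase2: ear touched by the RIGHT hand -> unassigned term taught for that sign
    phase3: ear touched by the LEFT hand  -> unassigned term the participant chose
    """
    if phase3["left"] == phase3["right"]:
        # The same statement circled for both signs: excluded from analysis.
        return "inconsistent"
    by_ear = dict(phase2)
    by_relation = {relation("right", ear): term for ear, term in phase2.items()}
    # Object-based: the meaning follows the ear that is touched.
    if all(by_ear[ear] == term for ear, term in phase3.items()):
        return "O"
    # Relational: the meaning follows the ipsilateral/contralateral relation.
    if all(by_relation[relation("left", ear)] == term for ear, term in phase3.items()):
        return "R"
    return "inconsistent"


# S-condition example from the text: right-hand-to-left-ear meant "Mother ...",
# right-hand-to-right-ear meant "Father ...".
s_phase2 = {"left": "Mother", "right": "Father"}
print(code_response(s_phase2, {"right": "Father", "left": "Mother"}))

# R-condition example from the text: contralateral sign meant "... car",
# ipsilateral sign meant "... office".
r_phase2 = {"left": "car", "right": "office"}
print(code_response(r_phase2, {"right": "car", "left": "office"}))
```

Because each participant sees two signs and two candidate terms, any consistent answer falls into exactly one of the two categories.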


                                       R     O   total
Subject varies (Mother/Father)        26    46      72
Predicate varies
(is in the car/is in the office)      43    18      60

Table 1. Frequencies of relational (R) and object-based
(O) mappings for conditions in which subject varies and
in which predicate varies in Experiment 1.
The frequencies of relational and object-based
mappings were then compared. As can
be seen in Table 1, the assignment of meaning
to an ambiguous sign was determined by which
aspect of the locative statement was unassigned.
Participants in the S condition (subject varies)
were more likely to make object-based mappings,
whereas participants in the R condition
(predicate varies) were more likely to make
relational mappings. These two patterns of
response for the S condition and the R condition
were significantly different, χ²(1, N = 133) =
15.64, p < .001. The overall frequencies of the
two mapping patterns were approximately the
same: combining the two experimental conditions,
participants chose relational and object-based
mappings with similar frequency.
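
As a check on the reported statistic, the Pearson chi-square test of independence can be recomputed from the cell counts printed in Table 1 (which sum to N = 133). The sketch below uses only the Python standard library; the function name `pearson_chi_square` is hypothetical, and the p-value formula holds only for one degree of freedom.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of condition and mapping type.
            expected = row_total * col_total / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


# Rows: S condition, R condition; columns: relational (R), object-based (O).
table1 = [[26, 46], [43, 18]]
chi2, n = pearson_chi_square(table1)

# With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2(1, N = {n}) = {chi2:.2f}, p < .001: {p_value < 0.001}")
```

No continuity correction is applied here; with these counts the uncorrected statistic rounds to the value reported in the text.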

The  results  of  Experiment  1  were  consis¬ 
tent  with  structure-driven  mapping  of  concep¬ 
tual  to  spatial  schemas.  When  the  subject  of  the 
locative  statement  was  unassigned,  participants 
mapped  the  unassigned  subjects  to  physical 
objects — the  right  and  left  ears.  When  the  pred¬ 
icate  of  the  locative  statement  was  unassigned, 
participants  mapped  the  unassigned  predicates 
to  physical  relations  —  the  ipsilateral  and  con¬ 
tralateral  relation  of  the  arm  to  the  body. 

Experiment  2: 

Relational  Structure 
In  Active  Declarative  Statements 

The  results  of  Experiment  1  indicated  that 
relational  structure  influences  the  mapping  of 
locative  statements  to  novel  spatial  schemas. 
Experiment  2  investigated  whether  mapping 
active  declarative  statements  to  novel  spatial 
schemas  would  reveal  the  same  pattern  of  struc¬ 
ture-driven  mapping.  Experiment  2  used  the 
same diagrams and three-phase procedure as
Experiment  1.  Whereas  Experiment  1  paired 
signs  with  simple  locative  statements,  such  as 
"Mother  is  in  the  office"  and  "Mother  is  in  the 
car,"  Experiment  2  paired  signs  with  active  de¬ 
clarative  statements. 

The  active  declarative  statements  used  in 
Experiment  2  were  about  animal  characters 
performing  some  action  toward  another  animal 
character:  for  example,  "Monkey  visits  Mouse," 
and "Monkey bites Mouse." As in Experiment 1,
different types of relational structure were
contrasted  by  varying  which  aspect  of  the  state¬ 
ment  was  clearly  mapped  and  which  aspect  of 
the  statement  was  ambiguously  mapped  by  as¬ 
signing  a  particular  aspect  of  the  statements  to 
the  hands.  The  hands  always  signified  two  an¬ 
imals,  assigned  during  the  first  phase  using  the 
same procedure as Experiment 1.

In  the  statement  pairs  introduced  in  the  sec¬ 
ond  and  third  phases,  either  the  subject  varied 
(S  condition),  the  relation  varied  (R  condition), 
or  the  object  varied  (O  condition).  When  the 
subject  varied  (S  condition),  the  meanings  as¬ 
signed  to  the  hands  always  became  the  objects 
of  the  action,  and  two  subjects  were  introduced. 
The  relation  was  constant  for  an  individual  par¬ 
ticipant,  but  varied  between-subjects.  For  in¬ 
stance,  a  participant  in  the  S  condition  for  whom 
"Monkey"  had  been  assigned  to  the  right  hand 
in  the  first  phase,  in  the  second  phase  might 
have  read  the  statements,  "Mouse  visits  Mon¬ 
key"  and  "Bear  visits  Monkey,"  each  paired 
with  a  diagram  of  a  sign  made  with  the  right 
hand  (see  Figure  2). 

When  the  relation  varied  (R  condition),  the 
meanings  assigned  to  the  hands  became  the 
subjects  of  the  action,  and  two  relations  were 
introduced.  A  participant  in  the  R  condition  for 
whom  "Monkey"  had  been  assigned  to  the  right 
hand  in  the  first  phase,  in  the  second  phase 
might  have  read  the  statements,  "Monkey  vis¬ 
its  Mouse"  and  "Monkey  bites  Mouse,"  also 
paired  with  the  right  hand  signs  (Figure  2). 

When  the  object  varied  (O  condition),  the 
meanings  assigned  to  the  hands  became  the 
subjects  of  the  action,  and  two  objects  were 
introduced.  A  participant  in  the  O  condition  for 
whom  "Monkey"  had  been  assigned  to  the  right 
hand  in  the  first  phase,  in  the  second  phase 
might have read the statements, "Monkey visits
Mouse" and "Monkey visits Bear," each
paired  with  a  right  hand  sign  (Figure  2). 

As in Experiment 1, in Experiment 2 the
expectation  was  that  structure-driven  con¬ 
straints  on  mapping  conceptual  to  spatial  sche¬ 
mas  would  lead  people  to  map  the  unassigned 
and  varying  portion  of  the  statement  to  a  struc¬ 
turally  similar  aspect  of  the  accompanying  sign. 


Varying  subjects  and  varying  objects  were  ex¬ 
pected  to  lead  to  more  object-based  mappings, 
assigning  meaning  to  the  right  and  left  ears. 
Varying  relations,  in  contrast,  were  expected 
to  lead  to  more  relational  mappings,  assigning 
meaning to the ipsilateral and contralateral
relations of the arm to the rest of the body.
Experiment 2 thus manipulated three different
aspects  of  statements,  which  when  mapped  to 
spatial  schemas  were  expected  to  lead  to  two 
distinct  mapping  patterns:  both  varying  subjects 
and  varying  objects  should  lead  to  object-based 
mapping  of  conceptual  to  spatial  schemas, 
whereas  varying  relations  should  lead  to  rela¬ 
tional  mapping  of  conceptual  to  spatial  sche¬ 
mas.  These  two  mapping  patterns  were  again 
predicted  to  lead  to  opposite  judgment  patterns 
in  the  final  phase. 

METHOD 

Participants.  One  hundred  and  fifty-four 
students from the University of Munich
participated  in  Experiment  2  during  psycholo¬ 
gy  classes.  Participation  was  voluntary.  Approx¬ 
imately  one-third  of  the  students  were  randomly 
assigned  to  each  of  the  three  conditions. 

As in Experiment 1, two experimental questions
at the end served as a consistency measure.
Six subjects in the S condition, and two
subjects in each of the remaining conditions,
did not answer these two questions consistently
and were discarded from the analyses,
resulting in 32 subjects in the S condition, 58
subjects in the R condition, and 52 subjects in
the O condition.

Procedure  and  Design.  The  procedure  and 
materials  were  nearly  identical  to  those  of  Ex¬ 
periment  1,  with  the  change  that  the  statements 
paired  with  signs  in  the  second  and  third  phas¬ 
es  were  active  declarative  statements,  and  there 
were  three  experimental  conditions:  subject 
varying  (S  condition),  relation  varying  (R  con¬ 
dition),  and  object  varying  (O  condition). 

On the first page of the experimental booklet
were two drawings of a character extending
his right and then left hand (see Figure 1), paired
with the sentences "This hand means
(Animal1)," and "This hand means (Animal2)."
Whereas in Experiment 1 varying the assignment
of meaning to the hands was related to
the primary experimental manipulation, in
Experiment 2 it was independent. Both the subjects
and the objects of the active declarative
statements were animal characters, and any
animal could be assigned to the hands. Four
animals were chosen (Monkey, Elephant,
Mouse, and Bear) and a random ordering of
these four was created. Three new orderings
were created by rotating the list. The other
three orders were thus: Elephant, Mouse, Bear,
and Monkey; Mouse, Bear, Monkey, and Elephant;
and Bear, Monkey, Elephant, and
Mouse. The first position in the list was Animal1,
the second position in the list was Animal2,
and so on. These four orders were counterbalanced
between subjects.
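
The rotation scheme for the four animal orders is easy to reproduce. In this minimal sketch the base order (Monkey, Elephant, Mouse, Bear) is inferred from the three rotated orders listed in the text, and the function name `rotations` is hypothetical.

```python
def rotations(order):
    """Return every cyclic rotation of a list, starting with the list itself."""
    return [order[k:] + order[:k] for k in range(len(order))]


# Base order inferred from the rotated orders given in the text.
base_order = ["Monkey", "Elephant", "Mouse", "Bear"]

for order in rotations(base_order):
    # Position 1 is Animal1 (right hand), position 2 is Animal2 (left hand),
    # and positions 3 and 4 supply the animals introduced in the second phase.
    animal1, animal2, animal3, animal4 = order
    print(f"Animal1={animal1}, Animal2={animal2}, Animal3={animal3}, Animal4={animal4}")
```

Rotating rather than fully permuting the list means each animal occupies each of the four roles exactly once across the four orders.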

On the second page of the booklet were two
more drawings of the same character touching
his right ear with his right hand, and touching
his left ear with his right hand (see Figure 2),
with the order of these two drawings counterbalanced
across subjects. Above each drawing
was a sentence. For the S condition, the two
sentences were of the form, "This means
'Animal3 R-action Animal1'" and "This means
'Animal4 R-action Animal1.'" The relation
(R-action) was either "visits" or "bites" and was
counterbalanced across subjects. For the R
condition, the two sentences were of the form, "This
means 'Animal1 R-action1 Animal3'" and
"This means 'Animal1 R-action2 Animal3.'"
As with the S condition, the relations were
"visits" and "bites," and the order of these two
relations was counterbalanced across subjects. For
the O condition, the two sentences were of the
form, "This means 'Animal1 R-action Animal3'"
and "This means 'Animal1 R-action
Animal4,'" and again the relation was either
"visits" or "bites" and was counterbalanced
across subjects.

On the third page of the booklet were two
drawings of the same character touching his left
ear with his left hand, and touching his right
ear with his left hand (see Figure 3), with the
order of these two drawings counterbalanced.
Above each drawing was the question, "What
does this mean?", and below each drawing were
two sentences, with the instruction, "Circle the
answer that fits best." For the S condition, the
two sentences were of the form, "This means
'Animal3 R-action Animal2' OR This means
'Animal4 R-action Animal2.'" For the R
condition, the two sentences were of the form, "This
means 'Animal2 R-action1 Animal3' OR This
means 'Animal2 R-action2 Animal3.'" For the
O condition, the two sentences were of the form,
"This means 'Animal2 R-action Animal3' OR
This means 'Animal2 R-action Animal4.'" For
each condition, the order of the two sentences
was counterbalanced across subjects. Note that
the two sentences are identical to those introduced
in the second phase, with the single
change that Animal2, which has already been
assigned to the left hand, is substituted for
Animal1, to correspond to the left-handed actions
in the drawings.
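
The sentence templates for the three conditions can be expressed as a small generator. This is a sketch under stated assumptions: the function name `statement_pair` and its parameters are hypothetical, the animal order Monkey, Elephant, Mouse, Bear is one of the four counterbalanced orders, and counterbalancing of relation and sentence order is omitted.

```python
def statement_pair(condition, hand_animal, animal3, animal4, action="visits"):
    """Return the two statements paired with the signs made by one hand."""
    if condition == "S":
        # Subject varies: the hand's animal is the object of the action.
        return [f"{animal3} {action} {hand_animal}",
                f"{animal4} {action} {hand_animal}"]
    if condition == "R":
        # Relation varies: both relations are used, subject and object are fixed.
        return [f"{hand_animal} visits {animal3}",
                f"{hand_animal} bites {animal3}"]
    if condition == "O":
        # Object varies: the hand's animal is the subject of the action.
        return [f"{hand_animal} {action} {animal3}",
                f"{hand_animal} {action} {animal4}"]
    raise ValueError(f"unknown condition: {condition!r}")


animal1, animal2, animal3, animal4 = ["Monkey", "Elephant", "Mouse", "Bear"]

for condition in ("S", "R", "O"):
    # Phase 2 uses Animal1 (right hand); phase 3 substitutes Animal2 (left hand).
    phase2 = statement_pair(condition, animal1, animal3, animal4)
    phase3 = statement_pair(condition, animal2, animal3, animal4)
    print(condition, phase2, phase3)
```

The phase 3 statements differ from the phase 2 statements only in the animal assigned to the hand, which is what lets the left-hand judgments reveal the mapping learned from the right-hand signs.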

RESULTS  AND  DISCUSSION 

As in Experiment 1, participants who circled
the same statement for both signs
were considered to have given inconsistent
answers and were discarded from the analyses.
The answers given by the remaining participants
were then coded by whether the unassigned
meaning was mapped to the ears (object-based
or O mappings) or to the ipsilateral and
contralateral bodily relations (relational or R
mappings), in the same manner described in
Experiment 1, and the frequencies of relational and
object-based mappings were then compared.


                                   R     O   total
Subject varies
(Bear/Elephant/Mouse/Monkey)      11    21      32
Relation varies (bites/visits)    36    22      58
Object varies
(Bear/Elephant/Mouse/Monkey)      13    39      52

Table 2. Frequencies of relational (R) and object-based
(O) mappings for conditions in which subject varies, in
which action predicate varies, and in which object
varies in Experiment 2.


The mapping pattern varied significantly between
conditions, χ²(2, N = 142) = 16.49, p <
.001. As can be seen in Table 2, participants in
the S condition (subject varies) and the O
condition (object varies) were more likely to make
object-based mappings, whereas participants in
the R condition (predicate varies) were more
likely to make relational mappings.
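
The three-condition comparison can be recomputed from the counts in Table 2 in the same way. The sketch below is illustrative only: it relies on the fact that for exactly two degrees of freedom the chi-square upper-tail probability equals exp(-x / 2), so the p-value line would not be valid for tables of other sizes.

```python
import math

# Rows: S, R, and O conditions; columns: relational (R), object-based (O).
table2 = {"S": (11, 21), "R": (36, 22), "O": (13, 39)}

col_totals = [sum(col) for col in zip(*table2.values())]
n = sum(col_totals)

chi2 = 0.0
for label, row in table2.items():
    row_total = sum(row)
    for observed, col_total in zip(row, col_totals):
        # Expected count if mapping type were independent of condition.
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected
    print(f"{label} condition: {row[0] / row_total:.0%} relational mappings")

df = (len(table2) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi2 / 2)  # exact upper tail only when df == 2
print(f"chi2({df}, N = {n}) = {chi2:.2f}, p < .001: {p_value < 0.001}")
```

The per-condition proportions make the direction of the effect visible: relational mappings are the majority only when the relation itself is the varying part of the statement.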

Experiment  3: 

Relational  Structure 

In  Conjunctive  and  Disjunctive  Statements 

Experiment  3  investigated  whether  the 
same  type  of  relational  structures  are  revealed 
in the mapping of conjunctions and disjunctions
to artificial signs, using the same diagrams
and three-phase procedure as the previous
experiments. The statements paired
with signs in Experiment 3 were simple conjunctions
and disjunctions of animal characters,
such as "Monkey and Mouse," and
"Monkey or Mouse."

As  in  Experiments  1  and  2,  different  types 
of  relational  structure  were  contrasted  by  vary¬ 
ing  which  aspect  of  the  statement  was  clearly 
mapped  and  which  aspect  of  the  statement  was 
ambiguously  mapped  by  assigning  a  particular 
aspect  of  the  statements  to  the  hands.  As  in 
Experiment  2,  the  hands  always  signified  two 
animals,  assigned  during  the  first  phase  of  the 
experiment.  In  the  statements  paired  with  signs 
in  the  second  and  third  phases  of  the  experi¬ 
ment,  either  the  first  animal  (S  condition),  the 
second  animal  (O  condition),  or  the  relation 
between  them  (R  condition)  varied. 

When  the  first  animal  varied  (S  condition), 
two  new  animals  were  paired  with  the  animal 
previously  assigned  to  the  right  hand,  by  either 
a  conjunctive  relation  ("and")  or  a  disjunctive 
relation ("or"). The relation was constant for
an individual participant, but varied between
subjects. To compare with the examples described
in Experiment 2, a participant in the S
condition  of  Experiment  3  for  whom  "Monkey" 
had  been  assigned  to  the  right  hand  in  the  first 
phase,  in  the  second  phase  might  have  read  the 
statements,  "Mouse  and  Monkey"  and  "Bear 
and  Monkey,"  each  paired  with  a  diagram  of  a 
sign  made  with  the  right  hand  (see  Figure  2). 

The O condition was similar to the S condition,
with the difference that the animals assigned
to the hands occupied the first position
in  the  statements,  and  the  two  new  animals  oc¬ 
cupied  the  second  position  in  the  statements. 
For  example,  a  participant  in  the  O  condition 
for  whom  "Monkey"  had  been  assigned  to  the 
right  hand  in  the  first  phase,  in  the  second  phase 
might  have  read  the  statements,  "Monkey  and 
Mouse"  and  "Monkey  and  Bear,"  each  paired 
with  a  right  hand  sign  (Figure  2). 

When  the  relation  varied  (R  condition), 
both  conjunctive  ("and")  and  disjunctive  ("or") 
relations  were  introduced.  A  participant  in  the 
R  condition  for  whom  "Monkey"  had  been  as¬ 
signed  to  the  right  hand  in  the  first  phase,  in  the 
second  phase  might  have  read  the  statements, 
"Monkey  and  Mouse"  and  "Monkey  or  Mouse," 
also  paired  with  the  right  hand  signs  (Figure  2). 

As  in  the  previous  two  experiments,  the 
expectation  was  that  structure-driven  mapping 
would  pair  the  unassigned  and  varying  portion 
of  the  statement  with  a  structurally  similar  as¬ 
pect  of  the  accompanying  sign.  Varying  ani¬ 
mals,  whether  in  the  first  position  or  the  sec¬ 
ond  position,  were  expected  to  be  mapped  to 
the ears, an object-based mapping. Varying relations,
in this case "and" and "or," were expected
to be mapped to the ipsilateral and contralateral
relations of the arm to the rest of the
body.  These  two  mapping  patterns  were  again 
predicted  to  lead  to  opposite  judgment  patterns 
in  the  final  phase. 

METHOD 

Participants.  One  hundred  and  six  students 
from the University of Munich and the
University  of  Chemnitz  participated  in  Experi¬ 
ment  3  during  psychology  classes.  Participa¬ 
tion  was  voluntary.  Approximately  one-third  of 
the  students  were  randomly  assigned  to  each 
of  the  three  conditions. 

As  in  Experiments  1  and  2,  two  experimen¬ 
tal  questions  at  the  end  served  as  a  consistency 
measure.  Four  subjects  in  the  S  condition,  one 
subject  in  the  R  condition,  and  three  subjects 
in  the  O  condition  did  not  answer  these  two 
questions consistently and were discarded from
the analyses, resulting in 44 subjects in the S
condition, 32 subjects in the R condition,
and 30 subjects in the O condition.

Procedure  and  Design.  The  procedure  and 
materials  were  nearly  identical  to  those  of  Ex¬ 
periment  2,  with  the  change  that  the  statements 
paired with signs in the second and third phases
were conjunctive pairs, disjunctive pairs, or
both.  As  in  Experiment  2,  there  were  three  ex¬ 
perimental  conditions:  first  animal  varying 
(again  called  the  S  condition,  to  allow  easy  com¬ 
parison  with  Experiment  2,  despite  the  fact  that 
in  Experiment  3  the  first  animal  is  not  the  sub¬ 
ject  of  a  sentence),  relation  varying  (R  condi¬ 
tion),  and  second  animal  varying  (again  called 
the  O  condition,  despite  the  fact  that  the  sec¬ 
ond  animal  is  not  the  object  of  a  sentence). 

RESULTS  AND  DISCUSSION 

As  in  the  two  previous  experiments,  par¬ 
ticipants  who  circled  the  same  statement  for 
both  signs  were  considered  to  have  given  in¬ 
consistent  answers  and  were  discarded  from  the 
analyses.  The  answers  given  by  the  remaining 
participants  were  then  coded  by  whether  the 
unassigned  meaning  was  mapped  to  the  ears 
(object-based or O mappings) or to the ipsilateral
and contralateral bodily relations (relational
or R mappings), and the frequencies of relational
and object-based mappings were then compared.
The mapping pattern varied significantly
between conditions, χ²(2, N = 121) = 10.21,
p < .006. As can be seen in Table 3, participants
in the S condition and the O condition
were more likely to make object-based mappings,
whereas participants in the R condition
were more likely to make relational mappings.

Condition                                          R     O   total

First animal varies (Bear/Elefant/Maus/Monkey)    15    35      50
Relation varies (and/or)                          20    11      31
Second animal varies (Bear/Elefant/Maus/Monkey)   14    26      40

Table 3. Frequencies of relational (R) and object-based
(O) mappings for conditions in which the first animal
varies, in which relation varies, and in which the second
animal varies in Experiment 3.
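
The direction of the effect in Experiment 3 is easiest to see as the share of relational mappings within each condition. The sketch below takes the frequencies from Table 3; the dictionary keys, the function name, and the 50% cut-off used to label the majority pattern are illustrative assumptions, not part of the original analysis.

```python
# (R, O) frequencies per condition, taken from Table 3.
TABLE_3 = {
    "first animal varies (S)": (15, 35),
    "relation varies (R)": (20, 11),
    "second animal varies (O)": (14, 26),
}


def relational_share(r_count, o_count):
    """Proportion of participants in a condition who made relational mappings."""
    return r_count / (r_count + o_count)


shares = {cond: relational_share(r, o) for cond, (r, o) in TABLE_3.items()}
for cond, share in shares.items():
    majority = "relational" if share > 0.5 else "object-based"
    print(f"{cond}: {share:.1%} relational, mostly {majority}")
```

Only the condition in which the relation varies shows a majority of relational mappings, which is the pattern described in the text.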

General  Discussion 

The  three  experiments  reported  here  used 
an  artificial  sign  language  to  investigate  whether 
the  mapping  of  simple  statements  to  spatial 
schemas  is  constrained  by  similarity  of  relation¬ 
al  structures.  In  Experiment  1  adults  were 
shown  diagrams  of  hand  gestures  paired  with 
locative  statements,  and  asked  to  judge  the 
meaning of new gestures. In Experiments 2 and
3,  adults  were  asked  to  make  similar  judgments 
with  active  declarative  statements  and  conjunc¬ 
tive  and  disjunctive  statements,  respectively. 
Results  of  all  three  experiments  indicate  that 
adults  choose  physical  objects  to  represent  con¬ 
ceptual  elements  and  physical  relations  to  rep¬ 
resent  conceptual  relations.  These  results  cor¬ 
roborate  the  structure-driven  mapping  patterns 
found  in  previous  studies  of  visual  reasoning, 
in  which  relations  between  elements  were 
mapped  together,  and  relations  between  rela¬ 
tions  were  mapped  together  (Gattis,  1997). 

The results reported here are also compatible
with previous research with signed languages
indicating that in signing space,
objects or actors are assigned to a spatial locus
(Emmorey, 1996). Interestingly, however,
these results also indicate that nouns are
not always assigned to spatial loci, but rather
the structural role played by a noun determines
whether it is mapped to a spatial locus or a
spatial relation. In Experiment 1, the nouns
"car" and "office" were mapped to the ipsilateral
and contralateral bodily relations, not to
the right and left ears, because they were essential
parts of locative relational expressions,
"in the car" and "in the office."

One  alternative  explanation  to  the  struc¬ 
ture-driven  mapping  interpretation  suggest¬ 
ed  here  is  that  adults  are  mapping  roles  and 
movement  rather  than  structure  per  se  when 
mapping  statements  to  signs.  For  instance,  the 


ipsilateral  and  contralateral  gestures  could  be 
interpreted  as  movements  rather  than  bodily 
relations.  The  tendency  to  pair  the  locative 
expressions  "in  the  car"  and  "in  the  office" 
with  the  ipsilateral  and  contralateral  relations 
could  be  seen  to  emphasize  the  movement  to¬ 
ward  a  location,  rather  than  a  bodily  relation. 
This  explanation  is  a  variant  of  the  associa¬ 
tion-based  mapping  hypothesis,  and  assumes 
that  people  associate  movement  of  the  arms 
with  movement  to  a  location,  or  the  move¬ 
ment  of  an  action  such  as  "bite"  or  "visit."  If 
people  perceive  and  map  the  movement  path, 
however,  we  would  also  expect  that  the 
movement  marks  the  grammatical  roles  of 
subject  and  object,  as  in  signed  languages 
(Emmorey,  1996).  When  movement  is  used 
to  represent  an  action  in  ASL,  such  as  "The 
dog bites the cat," the direction of the movement
marks the grammatical roles of subject
and object: the subject is the starting location,
and the object is the end location. Were
adults  simply  mapping  locations  and  actions 
to  the  signs  shown  here  by  mapping  location 
and  action  to  a  movement  path,  we  would 
expect  to  find  stronger  mapping  patterns  for 
those  situations  in  which  the  object  of  the 
statement  was  mapped  to  the  ears  compared 
to  those  in  which  the  subject  of  the  statement 
was  mapped  to  the  ears.  In  Experiment  2, 
however, subjects and objects of active declarative
statements were mapped with equal
frequency to the ears, indicating that perceived
movement did not play an important
role in adults' mapping of conceptual schemas
to spatial schemas.

By  introducing  a  new  paradigm  for  study¬ 
ing  the  mapping  of  conceptual  and  spatial  sche¬ 
mas,  these  experiments  also  provide  an  inter¬ 
esting  task  for  studying  relational  structure  in 
language.  The  results  of  all  three  experiments 
indicate  that  adults  asked  to  interpret  this  arti¬ 
ficial  sign  language  choose  a  distinct  mapping 
pattern,  either  object-based  or  relational,  to  map 
linguistic  structures  to  hand  gestures.  Further 
research  might  use  this  paradigm  to  address  the 
relational  structure  underlying  linguistic  utter¬ 
ances  as  well  as  reasoning  schemas. 

When people apply existing knowledge to
new  tasks,  the  circumstances  surrounding  that 
application  can  vary  enormously  from  one  sit¬ 
uation  to  the  next.  Potentially  important  varia¬ 
tions  include  the  purposes  to  which  the  old  in¬ 
formation  is  put,  the  conceptual  distance  be¬ 
tween  the  old  source  and  the  new  target  domain, 
and  the  person’s  state  of  knowledge  regarding 
the  target.  Considering  some  of  these  variations 
can  help  to  provide  a  broader  context  for  the 
research  I  will  present  and  for  thinking  about 
knowledge  transfer  more  generally. 

Knowledge  from  a  familiar  source  can  be 
used  for  the  purpose  of  reasoning  about,  ex¬ 
plaining,  or  otherwise  coming  to  understand  a 
less  familiar  target  domain,  or  it  can  be  used  to 
supply  the  starting  point  or  structuring  infor¬ 
mation  needed  for  the  design  of  novel  prod¬ 
ucts,  inventions  or  other  tangible  artifacts.  As 
a  short-hand,  these  different  uses  of  existing 
knowledge  can  be  referred  to  as  explanatory 
and  inventive,  respectively. 

In  terms  of  conceptual  distance,  the  source 
and  target  can  come  from  the  same  conceptual 
domain, from related, though nonidentical domains,
or from wildly discrepant domains (e.g.,
Dunbar,  1997;  Vosniadou  &  Ortony,  1989).  For 
ease  of  reference,  those  continuous  variations 
can  be  labeled  loosely  with  the  dichotomous 
terms  near  and  distant. 

Finally,  individuals  seeking  to  apply  source 
knowledge  to  a  target  situation  may  know  a 
great  deal  or  next  to  nothing  about  the  target. 
As  discussed  below,  initial  knowledge  about  the 


structure  of  the  target  should  be  richer  in  the 
explanatory  than  in  the  inventive  case. 

A  PARTITIONING  OF  CASES 

The  explanatory/inventive  and  near/distant 
distinctions  can  be  used  to  partition  knowledge 
transfer  situations  into  several  types.  For  ex¬ 
ample,  classic  instances  of  real-world  analogies, 
particularly  those  involved  in  scientific  discov¬ 
ery,  are  typically  characterized  by  the  use  of  a 
well-known,  but  conceptually  distant  source 
domain  to  explain  or  understand  a  relatively  less 
familiar  target  domain.  An  oft  noted  instance 
of  this  type  of  distant/explanatory  analogy  is 
Rutherford’s  comparison  between  the  familiar 
structure  of  a  solar  system  and  the  (then)  rela¬ 
tively  unknown  structure  of  the  atom.  Another 
less  noted,  but  equally  striking  instance  is  Ke¬ 
pler’s  analogy  between  the  properties  of  light 
and  a  hypothetical  motive  power  of  the  sun 
which  he  invoked  to  try  to  explain  planetary 
motion (Gentner, Brem, Ferguson, Wolff,
Markman, & Forbus, 1997).

Distant  sources  are  also  reported  to  serve 
the  purpose  of  envisioning,  designing,  and  pro¬ 
ducing  novel  inventions.  A  frequently  cited  in¬ 
stance  of  this  type  of  distant/inventive  analo¬ 
gy  is  the  role  of  burrs  in  the  invention  of  vel¬ 
cro.  According  to  the  story,  when  velcro’s  in¬ 
ventor,  George  de  Mestral,  used  a  microscope 
to  examine  burrs  that  had  attached  to  his  cloth¬ 
ing,  he  noticed  that  they  were  collections  of 
miniature “hooks” that had locked into the
“eyes” in the cloth of his pants and socks. Mestral
used that knowledge to design a similar
system  of  miniature  hooks-and-eyes  that  could 
be  used  as  a  fastener. 

Recent  observations  of  the  activities  of 
molecular  biology  laboratory  groups  have 
also identified a preponderance of near/explanatory
analogies, which involve the use of
information from either the same domain in
a  different  context,  or  a  closely  related  source 
domain  to  understand  the  target  domain  (e.g., 
Dunbar,  1997).  Instances  of  these  types  of 
analogies  identified  by  Dunbar  include  a 
mapping  from  how  HIV  operates  in  an  in  vivo 
context  to  how  it  works  in  an  in  vitro  con¬ 
text,  and  a  mapping  between  the  Ebola  and 
Herpes  viruses. 

To  complete  the  set,  the  world  is  replete 
with instances of near/inventive analogies
in which individuals stay within a domain,
but push its boundaries by envisioning and
bringing to fruition novel exemplars of that
domain. The term “inventive” here is not
used  to  restrict  these  types  of  analogies  to  the 
acts  associated  with  producing  patentable 
inventions,  but  rather  to  contrast  them  with 
those  analogies  designed  primarily  to  explain 
or  understand  a  phenomenon.  Thus,  when  an 
engineer  designs  a  new  gear,  a  novelist  crafts 
a  new  unlikely  hero,  or  a  country  singer  pens 
a  new  ballad,  their  creative  activities  can  all 
be seen as instances of near/inventive analogy
use. Examples of this type of activity
abound,  and  they  include  specific  cases  of  in¬ 
vention,  such  as  Thomas  Edison’s  patterning 
of  his  electric  light  distribution  system  after 
the  existing  gas  light  distribution  system  of  his 
day  (Friedel  &  Israel,  1979),  and  Eli  Whitney’s 
use  of  the  existing  charka  as  the  basis  for  his 
cotton  gin  (Basala,  1978).  They  also  include 
more  generic  tendencies,  such  as  science  fic¬ 
tion  writers’  reliance  on  Earth  animals  as  the 
bases  for  their  imaginary  extraterrestrials 
(Ward,  1994),  and  architects’  reliance  on  spe¬ 
cific  instances  of  prior  buildings  to  accomplish 
particular  goals  in  the  design  of  new  build¬ 
ings  (see  e.g.,  Kolodner,  1997). 


MENTAL  LEAPS,  MENTAL  HOPS, 
MAPPING  AND  ACCESS 

Considerable  research  has  focused  on  the 
use  of  analogy  in  reasoning  and  explanation, 
and,  at  least  from  the  examples  that  have  been 
described  most  often,  much  attention  has  been 
given  to  distant  analogies.  In  contrast,  the  cur¬ 
rent  presentation  will  focus  primarily  on  the 
sorts of products that emanate from near/inventive
uses of existing knowledge, with a particular
emphasis on the retrieval of highly representative
domain exemplars as sources of information.
However, it will also briefly attempt
to draw out connections to more distant and
explanatory types of transfer, and to delineate
some of the potential variations in goals and
outcomes across the situations. To what extent
is the transfer of old knowledge to new situations
governed by similar principles across the
range of conceptual distances and purposes?

As one possible difference across situations,
it is reasonable to postulate that distant
analogies are more likely to be associated with
extraordinary forms of creativity, whereas near
analogies are more likely to be associated with
everyday, relatively small creative increments.
If distant analogies are seen as creative “mental
leaps” (e.g., Holyoak & Thagard, 1995),
intra-domain conceptual extensions might be better
seen as creative “mental hops,” with less
deviation from the source and more attributes
preserved. That is, because the objects from
distant domains will differ greatly in their
superficial properties while at the same time
participating in comparable relations, only the
latter will tend to be mapped (Gentner, 1989)
across distant domains.

In contrast, because instances from the same
or close conceptual domains will share superficial
as well as deeper similarities, those surface
properties are more likely to be preserved in the
near than in the distant case. Put differently, the
new concept that results from the analogy process
will generally diverge less from the old ones
in near than in far analogies. Near analogies
reflect more of a literal similarity between the
source and target (e.g., Gentner, 1989); they may
represent smaller conceptual changes between
the old and new ideas, and thus may be seen as
less dramatically creative.

Having  linked  near  analogies  to  smaller 
creative  advances,  however,  I  hasten  to  add  that 
this  in  no  way  diminishes  their  importance. 
Human  progress  is  certainly  much  indebted  to 
the basic propensity to innovate in small
incremental steps that diverge only slightly from
what  has  come  before  (see  e.g.,  Basala,  1978). 

It  is  important,  too,  to  distinguish  the  con¬ 
ceptual  distance  between  old  and  new  ideas 
from  the  broader  impact  of  those  new  ideas. 
For  instance,  Edison’s  lightbulb  differed  only 
slightly  in  basic  form  from  several  less  success¬ 
ful  patented  versions  that  preceded  it.  Yet,  the 
end  result  of  widely  available  electric  light  had 
a  dramatic  effect  on  society.  Thus,  it  represent¬ 
ed  a  small  hop  from  what  had  come  before  con¬ 
ceptually,  but  a  giant  leap  in  terms  of  its  im¬ 
pact  on  the  world. 

Another  difference  across  the  types  of  sit¬ 
uations  is  that  the  inventive  case  seems  to  im¬ 
ply  less  initial  knowledge  about  the  structure 
of  the  target,  and  consequently,  a  more  limit¬ 
ed  role  for  an  initial  mapping  between  the 
source  and  target.  Unlike  the  case  of  explana¬ 
tory  analogies  that  presumably  arise  because 
there  are  observations  and  some  amount  of 
knowledge  about  a  target  domain  that  call  for 
further  explanation,  the  “targets”  or  products 
of  inventive  analogies  often  do  not  exist  until 
they  are  created  via  the  projection  of  structure 
from  the  source. 

For  example,  observations  about  planetary 
motion  existed  before  Kepler  applied  knowl¬ 
edge  about  light  to  explain  or  understand  those 
phenomena,  whereas  the  concept  of  velcro  did 
not exist, even in rudimentary form, prior to de
Mestral  realizing  that  the  structure  of  burrs 
could  be  adapted  to  produce  a  reusable  fasten¬ 
er.  Results  from  experiments  on  specific  dis¬ 
ease  processes  existed  to  be  explained  by  near 
analogies  to  other  known  disease  processes 
(Dunbar,  1997),  whereas  the  cotton  gin,  as  a 
specific  product,  did  not  exist  until  Whitney 
applied  knowledge  from  its  immediate  prede¬ 
cessor,  the  charka,  to  develop  it. 


Because the target, per se, tends to come
into  being  in  the  inventive  case  as  a  result  of 
the  analogical  process,  determining  the  map¬ 
ping  between  source  and  target  domains  is 
somewhat  simplified  relative  to  the  explanato¬ 
ry  case  in  which  the  relational  structures  of  the 
source  and  target  must  be  structurally  aligned 
to produce an effective analogy (e.g., Gentner
& Markman, 1997). This is not to say that the
goals  or  desirable  properties  of  inventions,  sto¬ 
ry  lines,  villains,  buildings,  and  so  on  are  not 
specified  in  advance  or  that  they  play  no  role 
in  adapting  the  structure  of  the  source  knowl¬ 
edge,  but  simply  that  mapping  between  domains 
is  minimized  and  projection  is  emphasized. 
Inventive  analogies  seem  to  reflect,  not  so  much 
a  process  of  comparison  of  structures  as  they 
do  a  process  of  projecting  or  instantiating  a 
known  structure  in  a  novel  way. 

Although  mapping  may  be  minimized,  a 
crucial  issue  for  inventive  analogies  (as  well  as 
explanatory ones) is to characterize how people
access the source information. What factors
determine the retrieval of the information
that  will  serve  as  the  basis  for  the  structure  of 
the  novel  product?  Here  too,  there  may  be  dif¬ 
ferences  across  situations. 

Similarity  of  surface  level  and  structural 
properties  between  the  target  and  source  is 
widely acknowledged as being crucial to retrieving
sources in explanatory analogical reasoning
(see e.g., Dunbar, 1997; Gentner, 1989;
Holyoak  &  Thagard,  1989;  1997;  Ross,  1989). 
However,  in  the  inventive  case,  the  target  only 
exists  after  the  fact,  and  similarity  to  the  source 
may  be  better  seen  as  the  consequence  rather 
than  the  cause  of  retrieving  a  particular  source. 
Alternatively,  if  the  goals  for  the  novel  product 
are  well-enough  specified,  and  the  person’s 
knowledge is indexed in a way that allows access
to previous cases that have satisfied those
goals, goal-relatedness might drive retrieval in
the  inventive  case  (see,  e.g.,  Kolodner,  1997). 

Beyond  similarity  to  the  target  and  the  ca¬ 
pacity  to  satisfy  the  goals  for  the  target,  retrieval 
of  source  information  may  well  be  determined 
primarily  by  the  properties  of  the  source  do¬ 
main  itself  as  well  as  more  general  conceptual 
processing tendencies, such as a reliance on the
basic  level  of  categorization.  Without  a  rich 
target  representation  driving  the  retrieval  of  a 
highly  similar  source,  properties  of  the  source 
domain  itself  may  take  on  special  importance 
in  determining  what  gets  retrieved  and  used  in 
the  inventive  case.  In  the  next  sections  I  de¬ 
scribe  a  series  of  experiments  concerned  with 
the  near/inventive  use  of  existing  knowledge, 
and  I  discuss  one  particular  model  that  high¬ 
lights  the  role  of  the  graded  structure  of  source 
domains  and  the  retrieval  of  highly  representa¬ 
tive  instances  from  those  domains. 

NEAR/INVENTIVE  ANALOGICAL 
PROJECTION 

Because  the  products  of  near/inventive  cre¬ 
ative  endeavors  are  direct  outgrowths  of  the 
concepts  that  have  come  before,  they  can  be 
expected  to  share  important  properties  with 
previous  exemplars  of  those  concepts.  This  is 
true  of  real-world  accomplishments,  such  as 
inventions,  art,  music,  writing,  and  science  (e.g., 
Basala,  1988;  Friedel  &  Israel,  1986;  Weisberg, 
1986),  as  well  as  laboratory-based  performance 
observed  in  a  variety  of  generative  tasks  (e.g., 
Ward,  1994;  Ward  &  Sifonis,  1997). 

As  an  illustration  of  a  laboratory-based  study 
concerned  with  the  role  of  existing  knowledge 
in  near/inventive,  creative  generation,  Ward 
(1994)  asked  college  students  to  imagine,  draw, 
and  describe  animals  that  might  live  on  other 
planets.  Despite  the  fact  that  the  planets  were 
described  as  being  completely  different  from 
Earth, Ward found that the students' creations
tended  to  be  strongly  analogous  to  Earth  ani¬ 
mals  in  many  respects.  At  the  level  of  superfi¬ 
cial  similarity  of  component  elements,  they  were 
very  likely  to  possess  standard  sensory  organs, 
such  as  eyes,  and  standard  appendages,  such  as 
legs  that  were  highly  similar  in  appearance  to 
their counterparts in Earth animals.

At  a  somewhat  deeper  level,  it  is  also  obvi¬ 
ous  from  the  participants’  drawings  and  descrip¬ 
tions  that  the  form  of  these  imagined  animals 
was  influenced  by  the  kinds  of  relational  struc¬ 
tures  that  connect  the  separate  elements  of  Earth 


animals.  That  is,  the  senses  and  appendages 
were  not  simply  scattered  about  randomly,  but 
rather  were  organized  into  symmetric  wholes 
within  bounded  solid  forms.  Likewise,  the  com¬ 
ponent  elements  of  the  creations  showed  a  kind 
of  one-to-one  correspondence  with  those  of 
Earth  animals  in  that  the  individual  sense  or¬ 
gans  and  appendages  tended  to  correspond  to 
single  matching  organs  and  appendages  of  Earth 
animals.  Eyes  matched  eyes  and  tended  to  serve 
only  the  single  function  of  extracting  visual 
information.  Legs  matched  legs,  and  tended  to 
serve  mobility  only. 

In  addition,  although  participants  did  not 
often  state  it  explicitly,  their  creations  also 
showed  a  kind  of  systematicity.  That  is,  clus¬ 
ters  of  symmetrically  placed  elements  seemed 
to  play  complementary  roles  within  broader 
goal  systems.  For  example,  the  eyes  serve  to 
collect  information  about  prey,  the  legs  allow 
an  approach  to  the  prey,  and  the  claws  provide 
the  capacity  to  grasp  it. 

It  is  important  to  note,  however,  that  de¬ 
spite  their  obvious  similarity  to  Earth  animals, 
the  imagined  animals  were  only  rarely  direct 
replicas  of  any  one  specific  Earth  animal.  Thus, 
they  possessed  some  degree  of  novelty,  while 
still  preserving  much  of  the  structure  of  the 
source  domain  of  Earth  animals. 

Although  with  hindsight,  these  results  are 
not  terribly  surprising,  it  is  important  to  note 
that  living  things  on  other  planets  could  con¬ 
ceivably  take  any  of  an  infinite  variety  of 
forms.  There  is  no  reason,  in  principle,  why 
they  would  have  to  resemble  Earth  animals  in 
their  surface  form.  Nevertheless,  people  pro¬ 
jected  many  of  the  characteristic  properties  of 
Earth  animals  onto  their  imagined  extraterres¬ 
trials.  Similar  results  have  been  found  with 
other conceptual domains, such as faces (Bredart,
Ward, & Marczewski, in press), and with
other age groups, such as young children
(Cacciari, Levorato, & Cicogna, 1997).

Taking  the  properties  of  the  novel  creations 
collectively,  they  seem  to  reflect  an  instance  of 
analogical  projection  from  a  well-known  source 
domain  (Earth  animals),  to  a  relatively  un¬ 
known  target  domain  (extraterrestrials  from 
planets different from Earth). That is, they were
structured by component elements that were
projected in a way that preserved structural
consistency or isomorphism, as well as a high level
of systematicity, which have been identified as
important ingredients of analogies (e.g., Gentner,
1989; Gentner & Markman, 1997; Holyoak
& Thagard, 1989; 1997).

THE  PATH-OF-LEAST-RESISTANCE 

To account for the structuring of new ideas
by old information, Ward and his collaborators
have proposed the path-of-least-resistance
model (Ward, 1994; 1995; Ward et al., 1997).
According  to  this  model,  when  people  ap¬ 
proach  the  task  of  developing  a  new  idea,  their 
thinking  carries  them  down  paths-of-least-re- 
sistance  in  their  conceptual  representation  of 
the  most  relevant  knowledge  domains.  They 
are  assumed  to  gravitate  toward  fairly  specif¬ 
ic  (basic  level)  exemplars  of  the  concept,  and 
to  project  the  properties  of  those  instances  onto 
the  novel  ideas  they  are  developing.  For  ex¬ 
ample,  in  developing  imaginary  extraterrestrial 
animals,  rather  than  remaining  at  the  broad 
level of “animal,” people tend to gravitate toward
more specific categories within that domain,
and to highly representative instances,
such  as  dogs  rather  than  less  representative 
ones,  such  as  iguanas. 

Although  there  are  many  different  mea¬ 
sures  of  representativeness  (Barsalou,  1985), 
the  one  Ward  et  al.  hypothesized  to  be  most 
predictive  was  Output  Dominance,  a  measure 
of  how  readily  instances  come  to  mind.  The  idea 
is  that  the  category  exemplars  that  come  to  mind 
most  readily  are  the  ones  most  likely  to  be  used 
as  starting  points  in  formulating  novel  ideas. 
The  rationale  is  that  generating  new  ideas  is 
cognitively  demanding,  and  people  tend  to  sim¬ 
plify  the  task  by  pursuing  ideas  that  come  readi¬ 
ly  to  mind. 

Ward  et  al.  (1997)  have  recently  provided 
support  for  the  path-of-least-resistance  model. 
They  first  determined  which  exemplars  were 
most  representative  of  the  domains  of  animals, 
tools,  and  fruit  by  having  college  students  list 


the  first  20  items  that  came  to  mind  for  each  of 
those  categories.  The  students’  responses  were 
then  tabulated  to  derive  Output  Dominance 
scores  for  each  exemplar,  that  is,  the  number 
of  students  listing  each  exemplar. 
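The tabulation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by Ward et al.: the function name `output_dominance`, the lower-casing of responses, and the three sample listings are all assumptions introduced here.

```python
from collections import Counter


def output_dominance(listings):
    """Count how many participants listed each exemplar.

    `listings` holds one list of exemplars per participant. Each exemplar
    counts at most once per participant, even if it was written twice.
    """
    counts = Counter()
    for items in listings:
        # Normalize case and whitespace, then de-duplicate within a participant.
        counts.update({item.strip().lower() for item in items})
    return counts


# Hypothetical listings from three participants (invented for illustration).
animal_listings = [
    ["dog", "cat", "horse"],
    ["Dog", "cow", "cat"],
    ["dog", "iguana"],
]

scores = output_dominance(animal_listings)
```

With these invented listings, "dog" receives the highest score because every participant listed it, whereas "iguana" is listed by only one.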

The  prediction  from  the  path-of-least-re¬ 
sistance  model  was  that  the  items  that  were 
found  to  be  highest  in  Output  Dominance  would 
be  the  ones  most  likely  to  be  used  as  the  basis 
for  novel  ideas  in  tasks  of  imagination.  To  test 
the  prediction,  Ward  et  al.  (1997)  then  had  dif¬ 
ferent  groups  of  college  students  imagine  ani¬ 
mals,  tools,  and  fruit  that  might  exist  on  other 
planets.  In  addition  to  drawing  and  describing 
their  creations,  the  students  listed  all  of  the  fac¬ 
tors  they  could  think  of  that  influenced  them 
during  the  creation  process.  Those  statements 
were  then  examined  for  references  to  specific 
exemplars  from  those  domains  (e.g.,  dogs,  ham¬ 
mers,  apples,  and  so  on),  and  across  the  do¬ 
mains,  roughly  two-thirds  of  the  participants 
mentioned  relying  on  such  specific  exemplars. 

References  to  each  exemplar  were  then  tab¬ 
ulated  to  derive  a  measure  termed  Imagination 
Frequency,  which  is  an  indicator  of  the  likeli¬ 
hood  of  any  given  exemplar  being  used  as  a 
starting  point  for  a  novel  creation.  For  instance, 
of  the  college  students  who  developed  imagi¬ 
nary  animals,  seven  mentioned  that  they  based 
their  creations  on  dogs,  which  resulted  in  dog 
receiving  an  Imagination  Frequency  score  of 
7. Across all three domains and several procedural
manipulations, Imagination Frequency
scores  were  found  to  be  significantly  positive¬ 
ly  correlated  (in  the  .60  range)  with  Output 
Dominance  scores.  That  is,  the  students  tended 
to  rely  most  heavily  on  those  category  exem¬ 
plars  that  come  to  mind  most  readily. 
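The relation between the two measures can be illustrated with a plain Pearson correlation. The sketch below is an assumption-laden illustration: the helper names `align_scores` and `pearson_r` are introduced here, and every number except the Imagination Frequency of 7 for "dog" is invented, so the resulting coefficient says nothing about the reported value in the .60 range.

```python
import math


def align_scores(output_dominance, imagination_frequency):
    """Pair the two scores per exemplar; exemplars never mentioned get 0."""
    exemplars = sorted(output_dominance)
    od = [output_dominance[e] for e in exemplars]
    im = [imagination_frequency.get(e, 0) for e in exemplars]
    return od, im


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores; only dog's Imagination Frequency of 7 is from the text.
od_scores = {"dog": 30, "cat": 28, "horse": 20, "iguana": 2}
if_scores = {"dog": 7, "cat": 5, "horse": 1}

od, im = align_scores(od_scores, if_scores)
r = pearson_r(od, im)
```

A positive `r` here simply reflects that, in the invented data, the exemplars listed by more participants are also the ones more often named as starting points.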

THE  UNCONSTRAINED  CASE 

Although the global findings reveal that
many  people  retrieve  and  use  specific  category 
instances,  and  that  those  instances  tend  to  be 
highly  representative  ones,  considering  varia¬ 
tions  in  the  task  conditions  used  by  Ward  et  al. 
(1997)  can  provide  additional  insight  into  the 
factors  that  do  and  do  not  affect  what  people 
retrieve from the source domains. In the first
experiment,  participants  imagined  animals  that 
might  live  on  other  planets,  but  they  were  giv¬ 
en  little  information  about  the  planets,  other 
than  the  fact  that  they  were  very  different  from 
Earth.  Participants  were  free  to  imagine  any 
creature  they  could,  with  no  constraints  on  what 
it  could  look  like,  in  what  type  of  environment 
it  might  need  to  survive,  and  so  on.  Consequent¬ 
ly,  it  is  possible  that  they  gravitated  toward  spe¬ 
cific,  highly  representative  Earth  animals  in  this 
unconstrained  case  largely  because  those  ani¬ 
mals  provided  an  easy  solution  to  the  task  at 
hand;  they  were  quickly  retrieved  from  memo¬ 
ry,  and  they  did  not  violate  any  specified  con¬ 
straints.  But  what  happens  to  retrieval  when 
various  constraints  are  imposed  or  when  addi¬ 
tional  information  about  the  target  is  given? 

DESIGN  CONSTRAINTS 

In  the  second  experiment  of  Ward  et  al. 
(1997),  participants  imagined  novel  tools  that 
might  be  used  by  a  species  of  intelligent  ex¬ 
traterrestrials.  Some  participants  were  given 
no  design  constraints,  whereas  others  were 
asked  to  imagine  tools  that  could  meet  the 
needs  of  an  alien  species  very  unlike  humans 
in  that  they  had  no  appendages.  The  idea  was 
that,  because  manipulation  by  way  of  hands 
is  a  central  property  of  standard  tools,  con¬ 
straining  participants  to  consider  such  a  crea¬ 
ture  might  encourage  them  to  move  away  from 
Earth tool exemplars. Alternatively, however,
the  tendency  to  rely  on  highly  retrievable  ex¬ 
emplars  of  the  domain  may  be  strong  enough 
that  it  remains  even  when  those  exemplars 
would  need  to  be  heavily  modified  to  meet  task 
constraints.  By  this  latter  view,  participants 
facing  the  constraint  may  be  just  as  likely  as 
unconstrained  participants  to  rely  on  Earth  tool 
models,  and  they  will  simply  modify  those 
exemplars  to  meet  the  needs  of  the  species. 

The latter view clearly won out in this particular
experiment. Those participants who were
constrained  to  design  tools  for  creatures  that 
had  no  appendages  were  just  as  likely  as  those 
who  faced  no  design  constraints  to  retrieve  spe¬ 


cific  instances  of  Earth  tools  as  starting  points, 
and  those  retrieved  tools  were  no  less  likely  to 
be predominantly high in Output Dominance.
Thus,  the  relative  accessibility  of  category  ex¬ 
emplars  can  play  a  powerful  role  even  when 
other  situational  constraints  are  operative.  The 
path-of-least-resistance appears to be a seductive
and slippery one.

RETRIEVAL  CUES  FROM 
THE  TARGET 

It  is  important  to  note,  however,  that  the 
representativeness  of  instances  within  a  domain 
is flexible rather than rigid (e.g., Barsalou, 1987).
Consequently, it ought to be possible to bias
people  to  retrieve  and  make  use  of  particular 
types  of  instances.  Ward  (1994)  explored  this 
possibility  by  providing  participants  with  ad¬ 
ditional  information  about  the  properties  of  the 
target.  Specifically,  different  groups  of  partici¬ 
pants  were  told  that  the  creature  to  be  imag¬ 
ined  had  feathers,  scales,  or  fur,  or  they  were 
given  no  information  about  its  attributes. 

The subjects in the “feather” condition were
significantly more likely to include wings and
beaks as additional features, whereas those in
the “scales” condition were significantly more
likely to include fins and gills, relative to those
in the “fur” or control conditions. More importantly
for present purposes, self-reports indicated
that participants tended to base their creations
on particular instances of known birds,
fish,  or  mammals,  in  the  feather,  scales,  and 
fur  conditions,  respectively.  Thus,  the  differ¬ 
ent  cues  provided  about  the  target  led  to  the 
retrieval  of  different  instances  from  the  source 
domain  of  Earth  animals,  whose  properties  were 
then  mapped  onto  the  novel  entities. 

In  a  subsequent  experiment,  Ward  (1994) 
examined the interactive effects of two types
of information about the target domain on the
retrieval  and  use  of  specific  instances:  one  was 
general  information  about  the  environment  on 
the  creature’s  planet,  and  the  other  was  specif¬ 
ic  attributes  of  the  imagined  creature  itself. 
Some  participants  were  told  that  the  planet  was 
composed  mostly  of  molten  rock  with  only  a 
few islands of solid land. To obtain enough
food,  creatures  on  the  planet  needed  to  be  able 
to  travel  from  one  island  to  the  next.  Conse¬ 
quently,  being  able  to  fly  over  the  molten  rock 
would be an adaptive trait, and participants’ creations
were expected to be highly likely to fly.

Other  participants  were  told  that  the  plan¬ 
et  had  violent  winds  blowing  all  around  it, 
from  just  a  couple  feet  above  the  surface  all 
the  way  up  to  the  upper  reaches  of  the  atmo¬ 
sphere.  Flight  on  such  a  planet  might  be  ex¬ 
pected  to  be  maladaptive  and  few  flying  crea¬ 
tures  were  expected. 

In  each  planet  condition,  some  participants 
were  given  a  specific  detail  about  the  target 
creature,  namely  that  it  had  feathers.  Others 
were  told  that  it  had  fur. 

The  most  important  findings  were  that  a) 
participants  in  the  Molten-Feather  and  Molten- 
Fur  conditions  were  highly  likely  to  design  fly¬ 
ing  extraterrestrials,  thus  showing  a  sensitivity 
to  the  design  constraints  in  the  task,  but  that  b) 
they  appeared  to  have  arrived  at  those  creations 
by  different  paths.  Participants  in  the  former 
group  were  more  likely  than  those  in  the  latter 
group  to  produce  creatures  that  were  classified 
as  birdlike,  and  to  report  basing  their  creations 
on  specific  instances  of  Earth  animals.  A  plau¬ 
sible  account  of  the  findings  is  that  the  pres¬ 
ence  of  the  cue  “feathers”  led  participants  to 
retrieve  exemplars  of  birds  which  would  have 
been  compatible  with  the  environmental  con¬ 
straints  of  the  Molten  planet  (i.e.,  safe  travel 
over  the  molten  areas  from  one  island  to  the 
next).  In  contrast,  the  cue  of  “fur”  may  have 
led  participants  to  initially  retrieve  mammali¬ 
an  exemplars  which,  with  the  exception  of  bats, 
would  not  possess  the  desired  attribute  of  flight. 
Consequently,  those  exemplars  would  have 
been  rejected  in  favor  of  a  different  starting 
point.  However,  because  the  cue  of  “fur”  also 
would have reduced the likelihood of retrieving
birds,  birdlike  exemplars  would  have  been  un¬ 
likely  to  serve  as  that  next  starting  point.  Such 
conflicts  between  retrieved  exemplars  and  de¬ 
sired  properties  of  the  target  may  ultimately 
have  led  participants  to  construct  flying  crea¬ 
tures  on  the  basis  of  more  general  information 


about  flight  rather  than  on  the  basis  of  specific 
known  exemplars.  Thus,  the  end-product  would 
be  less  likely  to  resemble  a  bird. 

Participants  in  the  Windy  conditions  were 
less likely to produce flying creatures and less
likely  than  those  in  the  Molten-Feather  condi¬ 
tion  to  report  a  reliance  on  specific  Earth  ani¬ 
mals.  Presumably,  those  in  the  Windy-Feather 
condition  might  also  initially  have  retrieved 
birdlike  exemplars,  but  would  have  rejected  or 
drastically  modified  them  because  of  their  in¬ 
compatibility  with  the  environmental  condi¬ 
tions  on  the  Windy  planet. 

In  general  then,  the  findings  suggest  that 
information  about  the  known  properties  of  tar¬ 
gets  (e.g.,  feathers)  and  about  other  task  con¬ 
straints  can  interact  to  determine  the  probabil¬ 
ity  that  people  will  make  use  of  particular  in¬ 
stances  from  the  source  domain.  Target  cues 
can  increase  the  likelihood  of  retrieving  source 
instances  that  have  properties  that  match  the 
cue.  When  other  salient  properties  of  those  re¬ 
trieved  exemplars  are  compatible  with  the  task 
constraints,  people  tend  to  rely  heavily  on  those 
specific  exemplars.  When  those  other  proper¬ 
ties  conflict  with  task  constraints,  reliance  on 
specific exemplars can be reduced.

CONSTRAINTS  FROM  PERCEIVED 
TASK  DEMANDS 

It  may  seem  odd  that  people  would  grav¬ 
itate  toward  highly  representative  instances 
when  they  are  trying  to  be  creative.  Why  not 
shift  to  more  exotic  exemplars,  or  try  to  avoid 
them  entirely?  One  reason  that  people  may 
not  do  so  in  these  laboratory  tasks  is  that  they 
perceive  the  demands  of  the  tasks  different¬ 
ly  from  what  we  intended.  Perhaps  they  think 
that  they  are  supposed  to  use  representative 
exemplars  or  that  highly  original  products 
would  not  be  valued. 

To examine the role of expectations, Ward
et  al.  (1997)  had  participants  design  imaginary 
fruit  under  different  instructional  conditions. 
Some  were  told  to  be  creative  and  others  were 
given  no  special  instructions.  The  results  were 
straightforward: participants who were given
the creativity instructions were just as likely as
control  participants  to  rely  on  highly  represen¬ 
tative  instances  of  Earth  fruit  in  designing  their 
own  creations.  Thus,  the  heavy  use  of  highly 
representative  instances  is  not  due  exclusively 
to  perceived  demand  characteristics.  More  gen¬ 
erally,  although  expectations  will  surely  mat¬ 
ter  in  some  real-world  and  laboratory  situations, 
category  structures  may  often  be  powerful 
enough  to  produce  large  effects  in  spite  of  those 
expectations. 

ACCESS  TO  SPECIFIC  INSTANCES  AND 
LIMITATIONS  ON  CREATIVE 
FUNCTIONING 

A  particularly  intriguing  finding  is  that 
those  participants  who  report  that  they  base 
their  creations  on  specific  exemplars  from  the 
source  categories  design  imaginary  products 
that  are  rated  as  showing  less  originality  than 
those  produced  by  participants  who  report  oth¬ 
er  types  of  approaches  (Ward,  1994;  Ward  et 
al.,  1997).  That  is,  their  creations  diverge  less 
from  the  characteristic  properties  of  known 
instances  from  the  source  domains.  Having 
brought  specific  instances  to  mind,  the  partic¬ 
ipants  tended  to  project  the  properties  of  those 
retrieved  instances  onto  their  novel  creations, 
with  the  consequence  that  those  creations 
showed  less  innovation  than  ones  produced  by 
participants  who  adopted  different  approach¬ 
es  to  the  task.  Thus,  it  appears  that  one  of  the 
major  constraints  on  generative  or  creative 
functioning  lies  in  our  natural  tendency  to  rely 
on  previous  examples  when  thinking  of  novel 
concepts  or  ideas.  More  original  products  can 
be  expected  to  result  when  people  avoid  the 
tendency  to  apply  the  first  available  represen¬ 
tation  to  a  problem  (Ward  &  Sifonis,  1997). 

STRATEGIES  AND  POPULATION 
EFFECTS 

Relying  on  specific,  highly  representative 
exemplars  of  a  known  concept  and  projecting 
properties  from  those  exemplars  onto  novel 
creations  should  be  seen  as  strategic  choices. 


More  creative  individuals  may  be  expected  to 
be  more  flexible  in  the  use  of  their  conceptual 
knowledge,  better  able  to  avoid  reliance  on 
representative  instances,  and  less  likely  to 
project  characteristic  properties  from  specific 
exemplars.  To  examine  this  possibility  we 
have  recently  observed  the  performance  of 
gifted  adolescents  (who  can  be  hypothesized 
to  possess  that  cluster  of  conceptual  abilities) 
in  the  imaginary  fruit  task  (Ward,  Saunders, 
&  Dodds,  in  press). 

The  gifted  participants  showed  a  balance 
between  flexibility  and  rigidity  in  the  way  they 
approached  the  design  task.  That  is,  they  were 
less  likely  than  our  typical  college  student  sam¬ 
ples  to  rely  on  specific  types  of  Earth  fruit. 
However,  when  they  did  so,  they  were  just  as 
likely  to  gravitate  to  the  items  that  come  to  mind 
most readily, that is, that are highest in Output
Dominance. The correlations between Imagination
Frequency and Output Dominance scores
for  Earth  fruit  were  nearly  identical  to  those 
found  for  college  students. 

ABSTRACTION AND CONCEPTUAL
DISTANCE

The path-of-least-resistance model implies
that  people  should  be  able  to  develop  more  cre¬ 
ative  ideas  by  moving  back  up  the  path  in  the 
conceptual  hierarchy  to  more  abstract  levels. 
Properties  from  any  level  will  be  projected  onto 
the  novel  entity  being  constructed,  but  they  will 
be  less  specific,  and  thus  less  constraining  at 
more  abstract  levels.  For  example,  patterning 
of  a  novel  creature  after  a  dog  might  lead  to  the 
projection  of  two  eyes  placed  symmetrically  in 
the  head,  whereas  projection  from  “living  thing” 
might  lead  to  the  projection  of  “taking  in  infor¬ 
mation  about  the  environment,”  a  less  con¬ 
straining  property  that  could  be  instantiated  in 
an  indefinite  number  of  ways. 

Moving  back  up  the  path  might  be  thought 
of  as  enhancing  originality  by  shifting  the  case 
from  a  near  analogy  to  a  far  one.  At  a  specific 
level,  such  as  “dog,”  if  the  person  imports  infor¬ 
mation  from  yet  another  source  to  bolster  the 
originality  of  the  creation,  it  is  likely  to  be  a 
source in the same superordinate, such as a “cat.”
The  higher  the  level,  the  broader  the  superordi¬ 
nate  is  and  the  more  distant  that  other  source  can 
be.  At  a  very  broad  level,  such  as  “living  thing,” 
the  immediate  superordinate  might  be  as  broad 
as  “physical  entity”  which  could  open  the  possi¬ 
bility  of  importing  information  from  a  quite  dis¬ 
tant  domain,  such  as  “nonliving  thing”  (e.g., 
wheels  for  appendages).  In  so  doing,  the  length 
of  the  mental  hop  can  be  increased  so  that  it  more 
approximates  a  mental  leap. 
Many studies have shown that problem
solving  by  analogy  is  facilitated  when  a  sche¬ 
ma  that  is  potentially  applicable  to  a  class  of 
problems  is  constructed,  i.e.,  when  the  subject 
builds  an  abstract  representation  structure  that 
includes  the  goals  and  subgoals  to  be  reached, 
the  requirements  to  be  met,  and  the  strategy  to 
implement (e.g., Gick & Holyoak, 1983; Cummins,
1992). Nevertheless, a hypothesis recently
set forth by many authors (e.g., Brooks, Norman,
& Allen, 1991; Gobet & Simon, 1996a,
1996b;  Pierce  et  al.,  1996;  Anderson,  Fincham, 
&  Douglass,  1997)  is  that  several  representa¬ 
tion  structures  with  different  levels  of  abstrac¬ 
tion  may  in  fact  coexist,  including  special  cas¬ 
es  elaborated  at  a  low  level  of  abstraction.  De¬ 
pending  on  the  extent  to  which  the  to-be-solved 
target  problem  resembles  the  corresponding 
source  problems,  one  or  the  other  of  these  forms 
of  representation  will  take  precedence.  When 
the  target  problem  is  recognized  as  familiar,  an 
already  processed  case  would  be  searched  for 
and  adapted  to  it.  But  when  the  problem  can¬ 
not  be  connected  to  a  known  case,  an  abstract 
schema  would  be  applied  and  instantiated  (pro¬ 
vided,  of  course,  that  such  a  schema  exists  in 
long-term  memory).  There  is  still  little  experi¬ 
mental  data  in  support  of  this  hypothesis,  but  it 
appears  plausible  and  tempting  from  the  stand¬ 
point  of  cognitive  efficiency:  it  is  less  costly 
and  faster  to  adapt  a  known  case,  if  possible, 
than  it  is  to  systematically  reconstruct  or  re¬ 
calculate  the  solving  process  by  applying  and 
instantiating  an  abstract  schema.  Moreover,  this 
second  hypothesis  helps  account  for  the  fact  that 
novices  (who  do  not  yet  have  an  abstract  sche¬ 


ma)  manage  to  solve  problems  when  they  are 
very  similar  to  the  source  (e.g.,  Reed  &  Bol- 
stad,  1991). 
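The hypothesis outlined above, in which a familiar-looking target triggers adaptation of a stored case and an unfamiliar one falls back on an abstract schema, can be sketched as a simple dispatch rule. Everything in the sketch is an assumption made for illustration: the `Problem` class, the Jaccard overlap used as a stand-in for perceived surface similarity, and the 0.5 threshold do not come from the literature cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    surface: frozenset  # surface features (hypothetical encoding)
    structure: str      # label for the underlying solution structure


def surface_similarity(a, b):
    """Jaccard overlap of surface features (an assumed similarity measure)."""
    union = a.surface | b.surface
    return len(a.surface & b.surface) / len(union) if union else 1.0


def choose_process(target, known_cases, schemas, threshold=0.5):
    """Pick a solving route for `target`.

    1. If a stored case looks similar enough, adapt that case.
    2. Otherwise apply an abstract schema, provided one exists in memory.
    3. Otherwise no route is available.
    """
    best = max(known_cases,
               key=lambda case: surface_similarity(target, case),
               default=None)
    if best is not None and surface_similarity(target, best) >= threshold:
        return ("adapt_case", best)
    if target.structure in schemas:
        return ("apply_schema", target.structure)
    return ("no_solution", None)


# Hypothetical source and target problems.
source = Problem(frozenset({"a", "b", "c"}), "smothered_mate")
like = Problem(frozenset({"a", "b", "c", "d"}), "smothered_mate")
unlike = Problem(frozenset({"x", "y"}), "smothered_mate")
```

Under these assumptions, a solver holding only the stored case can handle `like` but not `unlike`, which mirrors the pattern the hypothesis predicts for novices who lack an abstract schema.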

The  experiment  reported  here  provides  ad¬ 
ditional  arguments  in  favor  of  this  hypothesis. 
Starting  from  the  same  source  problem,  we  at¬ 
tempted  to  lead  subjects  to  construct  knowledge 
at  different  levels  of  abstractness.  By  means  of 
various  measures,  we  then  tried  to  evaluate  the 
specific  and/or  general  knowledge  they  con¬ 
structed  and  used  in  solving  structurally  isomor¬ 
phic  problems. 

Here  is  an  overall  view  of  the  experiment. 

Subjects  had  to  find  the  solution  to  a  par¬ 
ticular  chess  problem:  attaining  “smothered 
mate  with  sacrifice”  near  the  end  of  a  chess 
game. 

The  subjects’  first  task  was  to  understand 
this  source  problem.  One  group  of  subjects  was 
given  an  explanation  of  the  problem  that  fo¬ 
cused  on  the  sequence  of  elementary  solving 
steps.  For  the  second  group,  the  explanation 
consisted  of  describing  the  general  principle 
behind  smothered  mate  with  sacrifice  and  il¬ 
lustrating  it  with  this  same  source  problem.  This 
second  experimental  condition,  likely  to  trig¬ 
ger  self-explanations  aimed  at  linking  the  ex¬ 
ample  to  the  general  principle,  was  expected  to 
promote  the  construction  of  an  abstract  sche¬ 
ma  (e.g.,  Brown  &  Kane,  1988). 

Next, the subjects had to solve two new
problems, one that was “like” the source problem
both in its structural and visual features,
and  one  that  looked  different  on  the  surface  but 
was  in  fact  structurally  isomorphic.  The  hypoth¬ 
esis  was  that  subjects  given  the  general  solving 



Evelyne Cauzinille-Marmèche and André Didierjean


principle  would  solve  the  “unlike”  problem 
better  than  subjects  in  the  other  group.  It  was 
also  hypothesized  that  these  subjects  would  do 
better  on  the  “like”  problem,  because  the  terms 
introduced  to  explain  the  solving  principle 
(“smothering”,  “sacrifice”,  etc.)  and  to  describe 
the  final  goal  and  the  various  subgoals  were 
expected  to  promote  the  encoding  of  the  spe¬ 
cific  features  of  the  source  problem  and  there¬ 
by  facilitate  its  retrieval  and  adaptation  to  the 
processing  of  problems  recognized  as  similar 
(e.g.,  Catrambone,  1995, 1996). 

After  solving  the  two  problems  (like  and 
unlike),  subjects  had  to  recall  the  source  exam¬ 
ple  as  accurately  as  possible.  This  phase  allowed 
us  to  determine  what  specific  aspects  of  the 
problem  were  stored  in  long-term  memory.  Our 
hypothesis  was  that  subjects  who  had  been  told 
the  general  principles  underlying  the  solution 
would  remember  the  source  problem  better, 
since they would have paid more attention to the relevant
pieces of the chessboard.

Finally,  subjects  had  to  order  a  set  of  new 
problems  according  to  how  much  they  resem¬ 
bled  the  source  problem  (in  terms  of  the  similar¬ 
ity  of  the  solving  process).  The  problems  to  rank 
differed  from  the  source  problem  in  their  sur¬ 
face  features  and/or  in  their  structure.  The  hy¬ 
pothesis  was  that  subjects  who  had  constructed 
an abstract schema would primarily use structure
as a criterion for judging problem resemblance
(e.g., Chi, Feltovich, & Glaser, 1981).

This experimental setup, in which some
measures allowed us to assess the specificity
of the knowledge constructed while others
served to evaluate its generality, should provide
insight into the representation levels elaborated
during the acquisition of micro-expertise,
and their use in problem solving.

METHOD 


Subjects 

Forty-four  psychology  students  (mean  age: 
23  years  4  months,  standard  deviation:  11 
months)  participated  in  the  experiment.  All  sub¬ 


jects  judged  themselves  to  be  novices  in  chess 
(having  played  less  than  once  a  year)  but  were 
familiar  with  the  rules. 

Procedure 

The  experiment  was  run  in  a  single  session 
lasting  approximately  one  hour.  Subjects  were 
tested  individually. 

After  a  familiarization  phase,  subjects  had 
to  analyze  a  source  example.  In  the  first  step  sub¬ 
jects  searched  for  the  solution  to  the  example 
problem presented on a chessboard, i.e., how the
white player could put the black king in checkmate
in a few moves. None of the subjects found
the solution in the allotted time (1 min). The
second  step  involved  explaining  to  half  of  the 
subjects  (“Case”  condition)  the  exact  solution 
procedure  for  this  particular  example,  and  to  the 
other  half  (“Principle”  condition),  the  general 
principle  of  smothered  mate  with  sacrifice,  il¬ 
lustrated  with  the  example.  The  subjects  then  had 
to  reproduce  the  correct  procedure  on  the  chess¬ 
board  while  explaining  the  moves. 

Then, the subjects had to solve two problems,
one “like” the example (both in its structural
and visual features) and one “unlike” the
example (a problem that looked different on the
surface but was in fact structurally isomorphic).
The time limit was set at 4 minutes per
problem.  Whenever  the  correct  solution  was 
found,  the  solving  time  was  recorded. 

After  this  problem  solving  phase,  the  sub¬ 
jects  were  given  an  empty  chessboard  and  the 
complete  set  of  chessmen,  and  were  asked  to 
recall,  as  fully  and  accurately  as  possible,  the 
layout  of  the  example  initially  explained  by  the 
experimenter. 

Finally,  the  example  layout  was  presented 
to  the  subject,  and  he  or  she  was  also  given  three 
other  layouts  and  asked  to  order  them  in  de¬ 
creasing  order  of  similarity  (in  terms  of  the  re¬ 
quired  solving  steps)  to  the  example  layout. 

Summary of results

Table 1 summarizes the results. The subjects
in the two groups (Principle and Case)
were distinguished on the basis of their performance
profile on like and unlike problems (+ +,
success on the two kinds of problems; + -, success
only on the “like” problem; - -, failure on
the two problems). In each case, the table gives
(i) the percentage of subjects with a “high” score
on the recall test (at least four pieces placed in
the correct location), (ii) the percentage of subjects
capable of reconstruction (they put at least
one relevant piece in a logical location that did
not change the structure of the game), and (iii)
the percentage of subjects whose similarity order
placed priority on structure.

                             Principle Group (N = 22)       Case Group (N = 22)
Performance profile          + +      + -      - -          + +      + -      - -
                             (n=7)    (n=10)   (n=5)        (n=1)    (n=9)    (n=12)
High recall                  71%      60%      80%          100%     44%      42%
Reconstruction capability    86%      70%      40%          100%     44%      25%
Structural criteria          86%      80%      80%          100%     67%      42%

Table 1. Summary of results for the two groups of subjects (Principle and Case): performance profiles on like and
unlike problems, recall test performance, reconstruction capability, and similarity based on structural criteria.
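The scoring rules behind Table 1 can be written out as a short sketch. The function names are assumptions introduced here, and the sample group at the end is hypothetical: it only shows how a cell such as 71% could arise from a subgroup of seven subjects, not the actual raw data.

```python
def performance_profile(solved_like, solved_unlike):
    """Label a subject by success (+) or failure (-) on the like/unlike problems."""
    if solved_like and solved_unlike:
        return "+ +"
    if solved_like:
        return "+ -"
    if not solved_unlike:
        return "- -"
    # Success on the unlike problem only; no such profile appears in Table 1.
    return "- +"


def high_recall(pieces_correct):
    """'High' recall: at least four pieces placed in the correct location."""
    return pieces_correct >= 4


def percentage(flags):
    """Percentage of subjects meeting a criterion, rounded to a whole number."""
    if not flags:
        raise ValueError("empty subgroup")
    return round(100 * sum(flags) / len(flags))


# Hypothetical recall scores (pieces correctly placed) for seven subjects.
recall_scores = [6, 5, 4, 4, 7, 2, 3]
share_high = percentage([high_recall(score) for score in recall_scores])
```

In this invented subgroup, five of the seven subjects reach the four-piece criterion, so `share_high` comes out at 71.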

These  results  indicate  some  very  important 
differences between the two experimental conditions,
“Principle” and “Case”. Differences
were  found  not  only  in  the  scores  on  the  like  and 
unlike  problems,  but  also  on  example  recall  and 
on  judgments  of  new  problem  similarity. 

Among  the  subjects  who  succeeded  on  both 
types  of  problems  —  all  but  one  of  whom  be¬ 
longed  to  the  Principle  group  —  most  seemed 
to  remember  the  example  well  and  were  capa¬ 
ble  of  reconstructing  the  game  without  chang¬ 
ing  its  structure.  In  addition,  most  subjects  iden¬ 
tified  the  structure  of  the  new  problems  and 
used  this  criterion  to  determine  how  close  they 
were  to  the  example. 

For  subjects  who  succeeded  on  the  like  prob¬ 
lem  only  or  who  failed  on  both  problems  (both 
groups  contained  such  subjects),  the  results 
showed  that  there  were  still  large  differences 


between  the  two  conditions  on  the  recall  and  sim¬ 
ilarity  tasks.  Case  group  subjects  who  only  suc¬ 
ceeded  on  the  like  problem  exhibited  poorer  per¬ 
formance  on  the  recall  and  similarity  tests  than 
Principle  group  subjects  with  the  same  profile: 
Case  group  subjects  remembered  the  example 
less  accurately,  were  less  often  capable  of  re¬ 
construction,  and  outnumbered  the  others  in  re¬ 
lying  on  surface  features  to  decide  how  similar 
the  new  problems  were  to  the  example.  The  same 
types  of  differences  between  the  two  groups  were 
observed  for  subjects  who  were  unable  to  cor¬ 
rectly  solve  either  problem.  While  the  non-solv¬ 
ers  in  the  Principle  group  remembered  the  ex¬ 
ample  well  and  some  of  them  were  able  to  re¬ 
construct  the  game,  the  corresponding  Case 
group  subjects  did  not  remember  the  example 
as  well  and  very  few  of  them  could  reconstruct. 
In  addition,  the  non-solvers  in  the  Principle  group 
primarily used a structural criterion for judging
new  problem  similarity,  whereas  a  majority  of 
the  non-solving  Case  subjects  relied  mainly  on 
surface  criteria. 

DISCUSSION 

This  experiment  pointed  out  the  existence 
of  different  ways  of  solving  problems  by  analo¬ 
gy:  the  use  of  an  abstract  schema  and/or  adapta¬ 
tion  of  a  source  case.  Various  measures  enabled 
us  to  identify  different  forms  of  source  example 


processing,  storage,  and  retrieval  for  solving  new 
isomorphic  problems.  These  experimental  results 
thus  support  the  hypothesis  that  during  learning, 
representation  structures  of  different  levels  of 
abstraction  are  elaborated,  and  that  access  to 
these  different  structures  depends  on  the  simi¬ 
larity  of  the  to-be-solved  target  problem  to  the 
already-processed  source  problem. 

We  devised  an  experimental  setup  that  led 
subjects  to  use  different  encoding  methods  to 
learn  the  example  problem,  which  involved 
winning  a  chess  game  in  a  given  way.  Some 
subjects  were  simply  shown  the  remaining  steps 
needed  to  win  in  this  particular  case.  Others 
were  given  the  general  solving  principle  for  this 
type  of  game  ending  (smothered  mate  with  sac¬ 
rifice),  which  was  illustrated  using  the  same 
example.  Learning  was  assessed  by  having  sub¬ 
jects  solve  two  new  problems  from  the  same 
problem  class,  one  like  the  example  in  its  sur¬ 
face  features  and  structure  and  one  that  was 
superficially  unlike  the  example  but  was  iso¬ 
morphic  to  it  from  the  structural  standpoint. 

The  results  showed  that  some  subjects  cor¬ 
rectly  solved  both  types  of  target  problems,  oth¬ 
ers,  only  the  like  problem,  and  still  others,  nei¬ 
ther  problem.  In  line  with  our  assumption  that 
exposure  to  an  abstract  principle  promotes 
learning  (e.g.,  Clement,  1994;  Catrambone, 
1995, 1996),  all  subjects  who  succeeded  on  both 
types  of  problems  (except  one)  were  subjects 
who  had  been  presented  with  the  abstract  solv¬ 
ing principle. These results are thus compatible
with the hypothesis that to be able to solve all
problems  in  the  problem  class  studied  here,  no 
matter  how  close  the  target  problems  are  to  the 
source,  it  is  necessary  to  construct  an  abstract 
solving  schema. 

However,  mere  exposure  to  the  abstract 
principle  did  not  induce  a  knowledge  level  in 
all  subjects  that  enabled  them  to  solve  both 
problems.  Many  subjects  only  succeeded  on  the 
like  problem,  and  others,  on  neither  problem. 
Subjects in the group that was only given the
specific procedure for solving the example
failed on one or both problems.

Subjects were found to be sensitive to surface
similarities between the target problem and
the example, even those who succeeded on both
new problems. Solving times were shorter for
the like problem than for the unlike one. These
results support the hypothesized existence of
two distinct processes: (1) a search-and-adapt
process that searches for the case and adapts it
to the new problem, and (2) an apply-and-instantiate
process that applies an abstract schema
and instantiates it with the specific data from
the target problem. Subjects may rely on one
or the other process, depending on how similar
the target is to the example (e.g., Brooks, Norman,
& Allen, 1991; Gobet & Simon, 1996a,
1996b; Pierce et al., 1996). When the problem
to  be  solved  is  deemed  by  the  subjects  to  be 
like  an  already  learned  problem,  they  access  that 
problem  and  attempt  to  adapt  it  to  the  solution. 
When  the  to-be-solved  problem  differs  from  the 
source  problem  in  its  surface  features,  subjects 
access  the  abstract  schema  and  attempt  to  ap¬ 
ply  it,  while  taking  the  specific  features  of  the 
new  problem  into  account.  For  subjects  who 
succeeded  on  one  problem  only,  it  was  always 
the  like  problem.  So  these  subjects  must  not 
have  built  an  abstract  schema  and  were  thus  lim¬ 
ited  to  adapting  the  solving  procedure  of  the 
example  to  the  target  problem.  This  was  only 
possible  when  both  the  target  problem’s  sur¬ 
face  features  and  structure  were  very  similar  to 
those  of  the  example.  For  subjects  who  failed 
on  both  problems,  adaptation  of  the  example 
to  the  target  problem  must  not  have  been  pos¬ 
sible,  even  when  the  two  were  very  similar  (e.g., 
Reed  &  Bolstad,  1991). 

The  data  we  collected  provide  further  in¬ 
sight  into  the  nature  of  the  representations  con¬ 
structed  and  used  by  subjects.  Our  results  clear¬ 
ly  showed  that  subjects  who  succeeded  on  the 
like  and  unlike  problems  had  acquired  general 
knowledge  for  solving  problems  in  this  class 
as  reflected  by  (a)  the  fact  that  they  were  able 
to  reconstruct  the  example  without  changing  its 
structure,  even  if  the  pieces  were  not  placed  in 
their  correct  locations,  and  (b)  the  fact  that  they 
were  able  to  assess  the  similarity  of  new  prob¬ 
lems  on  the  basis  of  a  structural  criterion.  But 
these  subjects  were  also  the  ones  who  were 
better at remembering the source problem: they
stored the relevant features of this specific example
in memory. Concerning the subjects who
only  succeeded  on  the  like  problem,  our  analy¬ 
ses  pointed  out  substantial  differences  in  the 
way  the  problems  were  encoded,  depending  on 
whether  or  not  the  subjects  had  benefited  from 
exposure  to  the  abstract  solving  principle.  Those 
subjects  who  had  been  exposed  to  the  general 
principle  usually  remembered  the  example 
more  accurately  than  the  other  subjects  did;  they 
were  also  better  able  to  reconstruct  the  exam¬ 
ple  without  changing  its  structure,  and  they 
placed  more  priority  on  structure  in  judging  how 
similar  new  problems  were  to  the  example. 
Analogous  results  were  obtained  for  subjects 
who  could  not  solve  either  problem.  It  thus  ap¬ 
pears  as  though  the  mere  analysis  of  the  sub¬ 
jects’  performance  profiles  on  problems  that  are 
like  and  unlike  the  example  is  insufficient  for 
determining  how  the  example  was  encoded. 
This  brings  us  to  the  more  general  issue  of  how 
source  problems  are  encoded  when  a  new  class 
of  problems  is  being  learned. 

In  attempting  to  define  the  different  encod¬ 
ing  modes  used  in  problem  solving  and  deter¬ 
mine  how  they  evolve  with  learning,  it  would 
certainly  be  a  gross  oversimplification  to  dis¬ 
tinguish  only  the  storage  of  special  cases  and 
the building of abstract schemas. In line with
classical theories of memory (see Tulving &
Thomson, 1973; Tulving, 1985; for a review,
see Tiberghien, 1997), it would no doubt be
more  useful  to  hypothesize  that  there  is  co-con¬ 
struction  and  co-existence  of  different  types  of 
problem  encoding,  some  more  perceptual  in 
nature,  others  more  episodic  and  procedural, 
and still others, more semantic and conceptual.

INTRODUCTION

In this paper, our attention will be centered
around how conceptualization comes into being
and can be a part of the analogical reasoning
process.

THREE ANALOGICAL SYSTEMS

We will distinguish three types of situations
upon which an analogy takes place. This will be
done on the basis of their relational structure.

Analogy between proportions

It is the classical schema "A is to B what C
is to D" (figure 1). In this kind of analogy, there
is one invariant (explicit or not) that permits a
similitude between two pairs of objects.

Fig. 1. Analogy between proportions

Analogy between systems

It is an isomorphism between two relational
structures (figure 2), as can be seen in Rutherford's
solar system and atom (Gentner, 1983),
or in Gick & Holyoak's (1983) fortress attack
and radiation problems. Most of the experiments
and theories on analogical reasoning are
based on this type of situation.

Fig. 2. Analogy between systems

Analogy between modified systems

A new isomorphism is derived from analogous
situations, each modified by analogous
transformations (figure 3). The new pair of analogous
situations can then be considered as a
complexification of the initial one: more objects
and properties are to be considered. Gentner's
(1983) use of the analogy between electricity
and flowing water is a good example of
analogy between modified systems. However,
it is regrettable that this context of reasoning is
seldom exploited in psychological research,
although it is a frequent approach in scientific
modeling (think of the A.I. metaphor of
human cognition, with its basic isomorphism,
and the large scope of possible variations).

Fig. 3. Analogy between modified systems

CONCEPTUALIZATION IN MODIFIED
ANALOGOUS SYSTEMS

We  have  examined  logical  aspects  of  analog¬ 
ical  situations ;  we  will  now  consider  psychologi¬ 
cal  aspects,  in  particular  with  regard  to  the  analogy 
between  modified  systems,  where  the  question  of 
conceptualization  arises  in  a  very  acute  way. 

Issue 

We would like to introduce the theoretical
issue with one of the results obtained by Gentner
and Gentner (1983).

Their goal was to "test the Generative Analogy
hypothesis: that conceptual inferences in
the target follow predictably from the use of a
given base domain as an analogical model. To
confirm this hypothesis, it must be shown that
the inferences people make in a topic domain
vary according to the analogies they use" (p. 100).
More precisely, among other predictions, they
thought that giving subjects a hydraulic/electricity
analogy would facilitate their inferences when
one battery was added in series or in parallel to
the electric circuits (thus making two different
complex situations). The authors' reason was
that "with reservoirs, the correct inferences for
series versus parallel can be derived by keeping
track of the resulting height of water". This prediction
was not supported by the results (but others
were, with other analogies). The authors put
forward two possible interpretations for this: the
lack of knowledge in the source domain and the
failure to notice and use the analogy.

Even if it may not be decisive in this
particular context*, we would like to suggest
another reason to illustrate a paradoxical aspect
of analogical reasoning.

* What the subjects were taught during the training
phase of the experiment was not detailed in the article. Were
they taught only the simple analogy, or the complex one as
well (with more elements in the circuit)?

When introducing the hydraulic conceptual
field to the reader as an electric analog, the authors
proposed to "consider what happens when two
reservoirs are connected in series, one on top of
the other" (p. 113, emphasis mine). The question is:
how can one "guess" that reservoirs in series are
placed one on top of the other, and not one behind
the other as in the spatial organization of batteries,
for example? The reason for the right configuration
lies in the correspondence "pressure/voltage"
and in the fact that doubling the height of
water doubles its pressure, in the same way that
two batteries in series double the voltage. Thus, it
appears that the solution to the problem is required
in order to make the right analogy. This constitutes
a paradox, as it is generally considered that it
is the use of the right analog that triggers the solution
to the problem.
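
The pressure/voltage correspondence behind the stacked configuration can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: it uses the hydrostatic relation p = rho * g * h for water and treats batteries as ideal voltage sources. The function names, the constants and the "same level" configuration are hypothetical and do not come from Gentner and Gentner's article.

```python
RHO_WATER = 1000.0  # kg/m^3, assumed density of water
G = 9.81            # m/s^2, assumed gravitational acceleration


def hydrostatic_pressure(height_m):
    """Gauge pressure (Pa) at the bottom of a water column of the given height."""
    return RHO_WATER * G * height_m


def series_voltage(voltages):
    """Total voltage of ideal batteries connected in series: the voltages add up."""
    return sum(voltages)


def stacked_pressure(heights_m):
    """Reservoirs placed one on top of the other: the water heights add up."""
    return hydrostatic_pressure(sum(heights_m))


def same_level_pressure(heights_m):
    """Reservoirs placed at the same level (assumed here to be filled to a
    common height): the outlet pressure is set by that height alone."""
    return hydrostatic_pressure(max(heights_m))


one_reservoir = hydrostatic_pressure(1.0)
two_stacked = stacked_pressure([1.0, 1.0])
two_same_level = same_level_pressure([1.0, 1.0])
two_batteries = series_voltage([1.5, 1.5])
```

With two identical reservoirs, stacking doubles the pressure, just as two identical batteries in series double the voltage, whereas placing the reservoirs at the same level leaves the pressure unchanged. Choosing the stacked configuration therefore already presupposes the pressure/voltage mapping.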

Thus, we have sought to know whether this
paradox is experienced by individuals during
analogical reasoning, and if so, how it is
dealt with. Tackling this question (which, in our
view, has been omitted in psychological research,
even if Clement's (1988) work comes close) requires
a detailed qualitative approach. This is why
we chose, as a first step, to proceed by case studies.

Expertise 

The conceptual domains used for the experiment
are fluid flow and heat flow. The main
characteristic of these phenomena is the evolution
towards a state of balance between a source
(S) and a receptor (R), linked together by an
intermediary element (I). We will introduce
them in a phenomenological way, without using
the mathematical relationships usually used
to describe the physical laws.

The  heat  flow 

The complex phenomenon to be simulated
with a hydraulic setting is the heat flow emanating
from a source through any material, a piece of
wood for example. Figure 4 represents the setting
for this thermal phenomenon [Th2], also showing
the heat flow direction, which we know is
oriented towards the colder part of the material.


[Figure 4: setting [Th2], showing the heat flow direction, the
material (R), the insulating material and the source of heat (S).]

Fig. 4. A complex thermal setting - any material
(piece of wood)

The objective is to study the spread of heat
in the material, using timed measurements of
the temperature (T) at different points (ideally
an infinite number) between the source
of heat and the top of the material. Thus, the
temperature function has one variable: the
time (t), and one parameter: the distance from
the source (d).

As is often done in physics, before studying
the overall system we will consider the "smallest"
possible part of it: a "slice", which will
constitute the simple system. A simple thermal
setting ([Th1], figure 5) can be made to experiment
on how the heat behaves in a slice of
material, by representing separately the two
properties of the material that are relevant to the
heat flux:

- the conductivity (K, how much heat
passes through the material per unit of time)
will be represented by a piece of paper (I) acting
as intermediary, whose capacity can
be neglected;

- the thermal capacity (C, the number of heat
units required to raise the temperature of the
material by one degree) will be represented by
a piece of copper (R) acting as a receptor, which
can be considered isothermal because of its very
high conductivity.

[Figure 5: setting [Th1], showing the copper (R), the insulating
material and the source of heat (S).]

Fig. 5. A simple thermal setting ("slice")

A thermometer is placed on top of the piece
of copper, in order to measure the evolution of
the temperature as a function of time, which
remains the variable. Because of the copper's
isothermal property, the distance parameter is
no longer significant, and is therefore not taken
into account.

The  hydraulic  flow 

In a hydraulic setting ([Hy1], figure 6), the
liquid flow can simulate the heat flow in a slice
(the evolutions will be the same). The source
(a large beaker containing a liquid) is connected
to a receptor (a thin beaker) via a pipe of
very small diameter. The conductivity of such
a setting depends on the diameter and length
of the pipe, and on the viscosity of the liquid.
The capacity of the receptor depends on
the diameter of the small beaker.

[Figure 6: setting [Hy1], showing the large beaker (S).]

Fig. 6. A simple hydraulic setting
(simulating the heat slice)

The  analogy 


Figure  7  sums  up  the  simple  analogy  between  the 
simple  hydraulic  and  thermal  systems. 


H: height; w: weight (of liquid); d: diameter; l: length
T: temperature; Q: heat; C: capacity; K: conductivity


Fig. 7. The simple analogy (correspondences between
objects, properties, magnitudes and flux)
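
The correspondences between the two simple settings can be illustrated numerically. The following is a minimal sketch, not the authors' model: it assumes that the flux is proportional to the difference between source and receptor (conductivity K), that the receptor's magnitude rises by flux/C (capacity C), that the source stays constant, and that a simple explicit time step is adequate. The function name and all parameter values are hypothetical.

```python
def simulate_slice(source, conductivity, capacity, dt=0.1, steps=200):
    """Evolution of the receptor magnitude (water height H or temperature T).

    At each time step the flux is K * (source - receptor), and the receptor
    magnitude increases by flux * dt / C.
    """
    receptor = 0.0
    history = [receptor]
    for _ in range(steps):
        flux = conductivity * (source - receptor)
        receptor += flux * dt / capacity
        history.append(receptor)
    return history


# Hydraulic slice: height of liquid in the thin beaker (arbitrary units).
heights = simulate_slice(source=10.0, conductivity=0.5, capacity=2.0)

# Thermal slice: temperature of the copper (arbitrary units).
# Different K and C, but the same ratio K/C as the hydraulic setting.
temperatures = simulate_slice(source=50.0, conductivity=2.0, capacity=8.0)

# Fraction of the source value reached by each receptor over time.
height_fractions = [h / 10.0 for h in heights]
temperature_fractions = [t / 50.0 for t in temperatures]
```

Under these assumptions, two settings with the same ratio K/C reach the same fraction of their source value at every time step, which is one way to read the statement that "the evolutions will be the same".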

EXPERIMENT 

The  aim  of  the  experiment  was  to  observe 
how  the  subjects  construct  a  complex  hydrau¬ 
lic  analog,  after  having  been  given  knowledge 
of  the  simple  analogy  and  the  complex  thermal 
system to be simulated.

Subjects and method

The 8 subjects taking part in the experiment
were first-year physics students at university.
They attended a four-hour practical class, initially
devised independently of our study. It
consisted mainly of carrying out experiments,
recording data and plotting graphs (this was
followed, one week later, by a theoretical class
for the interpretation of the results).

In order to introduce a problem situation,
the psychologist and the teacher adapted the
pedagogical scenario as follows:

1- the teacher introduces the simple experiments
[Hy1] and [Th1] as presented above (§Expertise).

2- the students (in pairs) carry out the experiments
[Hy1] and [Th1].

3- the students are asked to cite all the analogies
between [Hy1] and [Th1] they have noticed
during the experiments.

4- the teacher presents the simple relevant
analogies (the one presented above in §The analogy,
plus mathematical ones).

5- the teacher introduces the problem of
heat propagation in wood, and asks the students
"to find a hydraulic analogy that simulates the
wood setting and the heat evolution".

6- the students construct the complex analogy.


Data collection and processing

An observer accompanied the pairs of students.
Their conversations were tape-recorded and
their drawings were collected.

During phases 2, 3 and 6 of the previous
pedagogical scenario, the observer could interact
with the students.

The transcription of the audiotapes was
processed in three steps, leading to:

- sequences: units of discourse (①, ②, ...)

- micro-units: relevant extracts of the sequences

- reading table: formalized summary of the
micro-units (e.g., R[Hy1] indicates the receptor
of the simple hydraulic setting).

Results 

We  are  concerned  here  with  phase  6  of  the 
pedagogical  scenario.  We  present  a  protocol  of 
two  subjects. 


INTERPRETATION 

The interpretation of the protocol will be
organized around the three main steps of the
subjects' resolution: analysis, conception,
improvement*.


COMPLEX ANALOGIES - Subjects Lea and Gae

① Th1 → Th2: Analysis of the differences, transformations

Reading table:
  Th2: T = f(d,t)
  Th1: T = f(t)

Verbalizations:
  Gae: I can't see the difference.
  Lea: Because here [Th2], the temperature depends on the height and on the
       time, and with the copper, as it goes very quickly, it was only the
       temperature. We neglected X. (X is the letter given by the students
       to the distance parameter.)

② Th ↔ Hy: Mapping of the parameters, understanding of the problem

Reading table:
  X[Th2] ≡ H[Hy2]
  T[Th1] ≡ H[Hy1]
  T[Th2] ≡ H[Hy2]
  X[Th2] ≡ ???

Verbalizations:
  Lea: It is ... X would become H ... on the other hand, T was in fact H in
       the experiment. The temperature was H, it was the height.
  Lea: Therefore it must depend on two parameters.
  Lea: The temperature corresponds to the height, well, we keep the time but
       X, I don't know what would be its correspondence, hum, for the
       hydraulic system.

③ Th2 ↔ Hy1: Comparison of the evolutions and assimilation of the objects

Reading table:
  f(t)[Th2] ≡ f(t)[Hy1]
  R[Hy1] ≡ R[Th2]
  [illegible] R[Hy2]

Verbalizations:
  Obs: It [R[Th2]] warms up little by little
  Lea: but this R[Hy1] fills up hum...
  Gae: little by little also
  Lea: little by little also but...
  Gae: In fact, this R[Hy1], we consider that it is... the entire piece of
       wood.

④ Th1 → Th2, Th2 → Th1: Analysis of the transformations

Reading table:
  lQ[Th1] = Q[Th2]
  New magnitude: ρ
  Principle: Division

Verbalizations:
  Gae: And what we calculated with the temperature [Th1] in fact.... was the
       quantity that we had each time, tictictictic.... well that increased!
       in relation to the overall volume.. in, well, in the wood analogy [Th2].
  Gae: In fact it is the density, it is the density that increases.
  Gae: In fact we need to divide ... the big volume by the amount of what
       arrives each time.

⑤ Th1 → Th2, Hy1 → Hy2: Application of the Th transformation in the Hy setting

Reading table:
  Principle: Repartition
  Verification: reverse the division principle

Verbalizations:
  Lea: You mean to say that we should .... put loads of little reservoirs?
  Gae: (disregards Lea's proposal) Then... there is a little heat that
       arrives and spreads around
  Lea: all right, then we should make loads of little reservoirs?
  Lea: Well [Hy1] it is the same as the piece of copper because when .. the
       water arrives it .. well .. it can't be separated

⑤bis Th1 → Th2, Hy1 → Hy2: Application of the Th transformation on a Hy principle

Reading table:
  ρ = f(h) [Hy2] ≡ T = f(d) [Th2]
  Objection
  Identification param.

Verbalizations:
  Gae: The water should.. in fact.. each time.. it would be the same level
       but it is only the density that changes, this would be equivalent to
       the temperature.

⑥ Hy2: Setting proposal, justification

Reading table:
  H(d) > H(d+δ) [Hy2] ≡ T(d) > T(d+δ) [Th2]

Verbalizations:
  Lea: The water would arrive hum in a small reservoir and we need this
       reservoir to be filled in order for it to give some to the other one
  Lea: Because, if we divide the wood in many small parts, a first small part
       must be heated so that it can give some heat to the other part.
  Lea: The heat arrives here, it fills up here first, then it goes to the
       other one, and here it's OK.
  Lea: The problem is that for the wood system it's an infinitesimal quantity.
  Gae: Therefore we've got at last the magnitude X

⑦ Construction and integration of the propagation law

Verbalizations:
  Obs: But is there one that is completely warm before it goes to the other?
  Gae: no it wouldn't be exactly like that, it gives a little bit
  Gae: And in fact, that is what the analogy is: the water arrives like this,
       we have a first small beaker .. and Hop! it gives some to the other
       one ..
  Lea: It's filled in through minute holes because it slowly gives out some
       drops
  Gae: Therefore it is like for the wood analogy.. we don't need to wait
       until it fills up completely for the water to go to the other one
  Gae: And the more it fills up, the more it gives
  Lea: And the front-line of heat is preceded by a little added heat, in fact
       that's what it is!

⑧ Hy2: Integration of a property: Conductivity, speed factor

Reading table:
  Holes [Hy2] ≡ Pipe [Hy1]
  δQ = K(T(d) - T(d+δ))
  δV = K(H(d) - H(d+δ))

Verbalizations:
  Obs: And here [Hy1], it was a slowing down motion here...
  Lea: In fact the holes are so small that that's what makes it slow down.
  Gae: No the holes can be bigger, it depends
  Lea: It should be more or less proportional to the wood conductivity
  Gae: Yes you bring in a K factor in fact
  Lea: Yes, because what would be perfect would be to have a kind of porous
       material in order to ..
  Gae: The more there is the more will drop and .. a coffee filter, that's
       very good!

* The development of the steps may seem to be an
exemplary canonical model! But we underline that this finding
became obvious only after the cutting out into sequences in
the manner described (see §Data processing).

①→③: Analysis of the systems considered in
the task: highlighting of the differences, operations
of transformation, of mapping, of
assimilation

The subject Lea describes the difference
between the simple and the complex thermal
systems with regard to magnitudes (temperature
taken at one point, or taken depending on the
distance from the energy source). She explains
this difference by using, in an implicit manner,
the properties of the objects ("the copper, as it
goes very quickly.." points to the very high
conductivity). She then looks for a hydraulic
correspondence to the additional magnitude of
the complex thermal system (the distance), which
she does not identify. Two comments on this
matter: firstly, from the point of view of analogical
reasoning (A.R.), the subject understood that the
problem was to find a magnitude corresponding
to the distance; secondly, a kind of primitive
function of spatial apprehension led the subject
to think of making, in a first attempt, a
correspondence between the distance (in the heat
receptor) and the height of the water (in the
hydraulic beaker), probably because both are
bottom-up directed. For some time, this primitive
function is predominant in Gae's thinking ("in
fact, this [the receptor beaker], we consider that
it is the entire piece of wood"). This spatial
focusing, which can be considered as an obstacle
to the correspondences Hy-Th, will play an
important role, as Gae will rely on it to elaborate
the relationship between the simple and the
complex systems: the wood is the result of the
concatenation of simple "copper-paper" units. Lea
tried to go beyond this primitive function after
taking the contradiction into account: the height
of the water cannot be a parameter of a function
as well as the measured value. Thanks to this
approach, she will be able to materially interpret
the relationship established by Gae, as is
shown in the next step.

Proposal  for  a  setting  and  for  the 
evolution  of  the  phenomenon 

Lea applies the concatenation, using it as
a transformation of the simple hydraulic setting
into the complex hydraulic setting
("we should put loads of little reservoirs").
It is important to note that, at this moment,
the subject does not mention how the
reservoirs must be linked together. In giving
up his spatial apprehension with regard to the
distance-height equivalence (where water
density should vary as a function of its height
as the amount of heat varies with the distance!),
Gae will agree with Lea's proposal.
In order to verify the physical coherence of
the proposed setting, and to specify its
configuration, the subjects use the supposed
evolution of the complex phenomenon
in the heat and hydraulic systems ("the heat
arrives here, it fills up here first..."). They
come to the conclusion that their hydraulic
setting answers the question raised
in the previous task analysis: "Therefore
we've got at last the magnitude X". However,
Lea still shows signs of hesitation
("the problem is that, with the wood system,
it's an infinitesimal quantity"); she does
not pursue it immediately, but she will use this
comment for the following improvements.

⑦→⑧: Improvement of the setting by integrating
constraints, and stating laws

A  first  outline  of  a  complex  analogy  has 
been  marked,  for  which  a  certain  number  of 
elements  constitute  the  foundations  (in  par¬ 
ticular  the  concatenation  relation),  and  others 
can  now  be  removed  or  modified. 

Without any resistance, the subjects dismiss
the physics principle that justified their first
setting and modify it, taking into account principles
that had not been previously considered
("we don't need to wait until it fills up completely
for the water to go into the other one";
"[the holes are] proportional to the wood
conductivity"). In these extracts, it appears that the
material analogy of the settings provides some
kind of assistance, a materialization of thought,
in order to verify the relevance of the
proposed principle (a kind of meta-cognitive
formula could be: "in order to validate the hypothesis
of this principle, it must be applied for
the thermal as well as for the hydraulic setting").

In parallel to this materialization, the subjects,
using their spontaneous means of expression,
refer to two fundamental laws
which encompass the overall phenomenon:
the flux (of heat, water) is proportional to the
difference between the source and the receptor
of the considered magnitude (height, temperature):
"the more it fills up, the more it
gives", and the flux is proportional to a conductivity
property which depends on the objects,
and which can be measured.
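
The two informal laws, together with the concatenation of small reservoirs proposed by the subjects, can be put into a short simulation. This is a hedged sketch rather than a reconstruction of the students' setting: it assumes identical cells, a constant source, a closed far end and an explicit time step, and `simulate_chain` and its parameters are hypothetical names. Each cell receives from its upstream neighbour a flux equal to K times the difference of their levels, which mirrors the relation between the levels H(d) and H(d+δ) noted in the reading table.

```python
def simulate_chain(n_cells, source_level, k, capacity, dt, steps):
    """Levels in a chain of small reservoirs fed by a constant source.

    The flux between two neighbours is k * (upstream level - downstream level);
    the level of a cell changes by (inflow - outflow) * dt / capacity.
    """
    levels = [0.0] * n_cells
    for _ in range(steps):
        # Upstream neighbour of each cell (the source for the first cell).
        upstream = [source_level] + levels[:-1]
        inflow = [k * (u - h) for u, h in zip(upstream, levels)]
        # What flows into cell i+1 flows out of cell i; the last cell is closed.
        outflow = inflow[1:] + [0.0]
        levels = [h + dt * (fi - fo) / capacity
                  for h, fi, fo in zip(levels, inflow, outflow)]
    return levels


# Shortly after the start, and later on (hypothetical parameter values).
levels_early = simulate_chain(n_cells=5, source_level=10.0, k=1.0,
                              capacity=1.0, dt=0.1, steps=2)
levels_later = simulate_chain(n_cells=5, source_level=10.0, k=1.0,
                              capacity=1.0, dt=0.1, steps=50)
```

After only two steps the second cell has already received some water although the first is far from full, which corresponds to the students' remark that there is no need to wait until a reservoir fills up completely. Later on, the levels decrease with the distance from the source, as in the wood.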

DISCUSSION 

1- The analogical reasoning brings into play
various cognitive operations

Mapping  and  transfer  are  generally  the  two 
main  cognitive  processes  used  to  describe  the 
A.R.  We  found  occurrences  of  these  process¬ 
es,  but  they  seem  to  be  insufficient  to  describe 
analogical  modeling. 

Also,  procedures  like  the  identification  of 
differences  (between  the  simple  and  complex 
systems  :  there  is  one  more  magnitude  to  take 
into  account ;  between  the  complex  systems  : 
the  first  hydraulic  setting  does  not  answer  the 
infinitesimal  characteristic  of  the  wood  system), 
procedures  like  systems  transformations  (in  par¬ 
ticular  by  means  of  the  concatenation  relation, 
but  also  more  generally  the  transformations  au¬ 
thorized  by  the  objects  taken  into  account  and 
their  properties),  and  procedures  like  assimila¬ 
tion,  play  a  role  that  should  not  be  disregarded. 
In the last-mentioned procedure, an element is
extracted from the context of the system to which
it belongs and imported into the system in focus
(Gae "sees" the receptor beaker as the piece of
wood; Lea assimilates the heat flux to her complex
hydraulic system, using terms of the hydraulic
domain ("the heat (..) fills up")).

Additionally,  transfer  manifests  itself  in  our 
protocol  like  a  mechanism  that  is  not  unidirec¬ 
tional.  Evolutions,  relations  and  laws  are  import¬ 
ed  from  one  domain  to  the  other  and  vice-versa. 
For example, the subjects realize that the heat
propagation does not proceed "in stairway" fashion,
as is the case for the water propagation in the first
complex system proposed. They then import the
principle of "progressive propagation" into the
hydraulic setting and transform the setting for the
principle to be applied. In return, the subjects will
import into the thermal system the law of
propagation speed understood in the new hydraulic
system, probably thanks to the particularly visual
aspect of the evolution of the hydraulic phenomenon.

Lastly,  these  comings  and  goings,  together 
with  assimilation  operations,  seem  to  be  cog¬ 
nitively  important  for  the  construction  of  con¬ 
ceptual  and  relational  invariants. 

2-  The  analogical  reasoning  implies  repre¬ 
sentations  of  various  conceptual  registers. 

We observe the presence of various conceptual
registers: concepts related to the material
objects of the setting (mainly the kind
and configuration of the receptor); the objects'
properties (conductivity, diameter of the holes);
the magnitudes representative of the evolution
of the phenomena (the temperature, the
water level); relationships between objects
(mainly the concatenation), and between magnitudes
(functions, laws).

We suggest the hypothesis that, in search of better coherence at the current stage of his reasoning, the subject confronts these various registers, aiming to readjust the activated knowledge.

In A.R, this confrontation could be the driving force behind these comings and goings between the systems, each presenting different local and temporary facilities.

3- Analogical reasoning within a modeling task can generate learning by way of awareness processes.

In some ways, the laws cited by the subjects were not unknown to them, as they were part of the information delivered by the teacher.

However, it is clear that these laws take on another dimension as an outcome of the simulation work and of the awareness triggered by this work.

It is indeed symptomatic that these laws, which form the starting point of an expert's modeling work, are cited by the students at the end of the protocol.

Furthermore, it is also symptomatic that these laws were not expressed by the students in a formal or canonical manner; it is probable that, at that precise moment, these “informal laws” do not yet have precisely the status of a law for them, but they are ready to receive further explanation from the teacher.

CONCLUSION 

1- The paradox of analogical reasoning

With regard to our initial theoretical issue, the points developed in the discussion throw some light on the way subjects manage the paradoxical aspect of A.R, in which they need to anticipate “what’s going on” to construct a physical setting, and at the same time need to construct a setting to understand “what’s going on”. It seems that these two sides of the paradox are inherent to analogical reasoning, and may even support the conceptualization of the phenomena. Thus, the subjects would construct physics laws in order to test the “setting/evolution” coherence, by creating a relationship between the properties of the elements and the flux “authorized” by the properties and the configuration of these elements.

2- The study of analogical reasoning and a theory of representations

The elements of the discussion, together with the preceding conclusion, lead in our view to the idea that the study of analogical reasoning must be included in a theory of representations, like that of the “real-representation” homomorphism developed by Vergnaud (1987). Indurkhya (1992) modeled “similarity-creating metaphors” with reference to Holland’s model, which also postulates this homomorphism. But, on the one hand, metaphor and analogy are rather different processes and, on the other hand, Indurkhya’s work on A.R takes little psychological data into consideration.

Our perspective is to bring more elements in this direction.

INTRODUCTION: THE MENTAL FLUIDITY

Mental fluidity (or conceptual adaptation) appears in many activities that are more or less general, such as making analogies, understanding metaphors or puns, translating or condensing texts, imagining being another person, making counterfactuals, making speech errors, making humor, playing music in another style, blending words, recognising forms, conceptual learning, playing role games, understanding advertising, science fiction, politics, poetry (and this list is not exhaustive). All of these activities require the use of analogies.

We generally think that an analogy occurs when the subject finds the best matching between elements of two analogous situations, but there is also the fact that we perceive situations in a certain way and then make correspondences between some elements of these situations. We do not know all that we should know to act on the world, but we do know all we should know when we solve a problem in an experiment in which all the necessary information is provided; we then neglect a part of the analogical process, a perceptual part (Chalmers et al., 1992).

The capacity to perceive analogies between two objects, situations or facts at a certain level of abstraction is the general capacity that appears in activities requiring conceptual fluidity. This capacity is very natural: we can adapt ourselves conceptually to any new situation without searching through a list of what is happening, and without trying to understand what has changed compared with the time before. Indeed, we immediately see what is the same at a certain level of abstraction (i.e., we see one thing as another thing, depending on pressures) and try to use this information to respond to the problem posed by the situation. Consequently, analogy-making is seen here as a general cognitive process rather than an exceptional mechanism brought to bear only in unusual circumstances, and the resolution of an analogy is seen as a translation (or adaptation) from one structure to another (Hofstadter, 1985).

THE  MECHANISMS:  THE  COPYCAT 
MODEL 

The  copycat's  microworld 

A microworld of letters was created by Douglas Hofstadter to study the mechanisms of mental fluidity more rigorously (see Hofstadter et al., 1995). The microworld is composed of the letters of the alphabet, which is non-circular, each letter having knowledge of its neighboring letters. The COPYCAT project (Hofstadter & Mitchell, 1994), based on this letter-string microworld, illustrates this process with creative analogy problems such as «suppose the letter-string abc were changed to abd; how would you change the letter-string mrrjjj in “the same way”?»¹. This reduction of the world is necessary to grasp clearly all the operations used between the perception of a creative analogy problem and the resolution of the problem. The analogies are creative in the sense that the problem can have more than one coherent solution, depending on how the transformation between the first string and the second string of the problem is perceived (Figure 1).

Figure 1. Some solutions to the «abc » abd, mrrjjj » ?» problem, depending on the perceived transformation and the element of mrrjjj on which the transformation is applied.

What happens in the transformation of abc into abd? Is it the c that changes? The third letter? The last letter? The highest letter in the alphabet? And then, which element of the string mrrjjj corresponds to the c? Is it the last letter j? The third letter r? The last group of letters jjj? And what exactly is the transformation: does the c (or third letter, or last letter) become a d, or become the successor of c, or the successor of the last or highest letter? All these considerations lead to some solutions, but the solution given depends on the perception of the problem. For example, “the last letter becomes d” can lead to the solution mrrjjd. Another response is mrrjjk, obtained by matching the letter c of abc with the last letter j of mrrjjj, and then applying the (perceived) rule “replace last letter by successor”. Another, more abstract level of perception is to consider neither the letters nor the letter groups (which would give, for example, mrrkkk), but the group lengths: perceiving mrrjjj as the length-string 1(m)-2(r)-3(j) leads to the length response 1(m)-2(r)-4(j), and so to the letter-string mrrjjjj, with the rule “replace length of last group by successor”. The COPYCAT model (Mitchell, 1993) is able to give one solution to a problem depending on the perceived relations, and to give another solution to the same problem if the perception of the relations that the program “has” at the beginning of the resolution is different. The recent extension METACAT, being developed by Marshall (1997), seems to permit the creation of rich representations of the analogies made in this microworld.

¹ From now on, we write such a problem as: «abc » abd, mrrjjj » ?».
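The alternative readings of «abc » abd, mrrjjj » ?» discussed above can be made concrete with a small sketch. This is only an illustration written for this text: it is not the COPYCAT program, which builds such rules through probabilistic perceptual agents, and all function names here are hypothetical. Each function hard-codes one perceived rule and applies it to the target string.

```python
from itertools import groupby


def successor(letter):
    """Next letter in the non-circular alphabet; 'z' has no successor."""
    if letter == "z":
        raise ValueError("'z' has no successor in a non-circular alphabet")
    return chr(ord(letter) + 1)


def groups(s):
    """Split a string into runs of identical letters: 'mrrjjj' -> ['m', 'rr', 'jjj']."""
    return ["".join(run) for _, run in groupby(s)]


def last_letter_becomes_d(s):
    """Reading: 'the last letter becomes d'."""
    return s[:-1] + "d"


def last_letter_to_successor(s):
    """Reading: 'replace last letter by successor'."""
    return s[:-1] + successor(s[-1])


def last_group_to_successor(s):
    """Reading: 'replace every letter of the last group by its successor'."""
    g = groups(s)
    g[-1] = successor(g[-1][0]) * len(g[-1])
    return "".join(g)


def last_group_length_to_successor(s):
    """Reading: 'replace length of last group by successor' (3(j) -> 4(j))."""
    g = groups(s)
    g[-1] = g[-1][0] * (len(g[-1]) + 1)
    return "".join(g)


readings = [
    last_letter_becomes_d,
    last_letter_to_successor,
    last_group_to_successor,
    last_group_length_to_successor,
]
answers = {rule.__name__: rule("mrrjjj") for rule in readings}
for name, answer in answers.items():
    print(f"{name}: {answer}")
```

The first three rules all turn abc into abd, which is why each is a coherent reading of the source transformation; the fourth corresponds to the case where the notion of successor slips from letters to group lengths in the target string.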

The architecture of the model is based on the interaction of a large number of perceptual agents with an associative, overlapping, and context-sensitive network of concepts. This particular architecture (Vivicorsi, 1996a) permits the emergence of robust high-level behavior from the interactions of a great number of low-level nondeterministic perceptual micro-agents. All decisions are probabilistic, so at any time the system can move towards one solution or another, depending on all the perceived or constructed features that lead more or less strongly towards a specific solution, with no predetermined solution. Through the number of possible solutions given to the same problem, this probabilistic dynamic gives the model the capacity to appear more flexible than the majority of cognitive models.

THE  MECHANISMS 

Two mechanisms are proposed and implemented in the model to account for the flexibility of the COPYCAT program. First, a high-level perception mechanism accounts for the encoding of the problem in a certain way: we perceive, for example, that it is the third letter that changes in «abc » abd», and we then carry out the corresponding transformation on «mrrjjj » ?», leading to the solutions mrdjjj or mrsjjj. Second, a perception-conceptualization loop is necessary to adapt ourselves better to situations; that is, the perception of the problem is not separated from its cognitive implications for the resolution of the problem. For example, perceiving the groups rr and jjj in the string mrrjjj entails conceiving m as a group; conceiving m as a group of one letter entails perceiving rr as a group of two and jjj as a group of three; perceiving jjj as a group of three letters entails conceiving 4(j) as a successor of 3(j). And this loop can be generalised from one problem to another, by immediately perceiving the group iiii as 4(i) (and not, for example, as two groups ii) in «abc » abd, iiii » ?».
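As a rough illustration of the two mechanisms just described, the sketch below shows the “third letter” reading of the problem and the length description of a string such as 1(m)-2(r)-3(j). It is an assumption-laden toy written for this text, not part of COPYCAT: the function names are hypothetical, and a real perception-conceptualization loop would revise these descriptions dynamically rather than compute them in one pass.

```python
from itertools import groupby


def third_letter_becomes_d(s):
    """Reading: 'the third letter becomes d'."""
    return s[:2] + "d" + s[3:]


def third_letter_to_successor(s):
    """Reading: 'the third letter becomes its successor' (assumes it is not 'z')."""
    return s[:2] + chr(ord(s[2]) + 1) + s[3:]


def length_description(s):
    """Describe a string as (group length, letter) pairs, one per run of identical letters."""
    return [(len(list(run)), letter) for letter, run in groupby(s)]


def render(description):
    """Write a length description in the paper's notation, e.g. '1(m)-2(r)-3(j)'."""
    return "-".join(f"{n}({letter})" for n, letter in description)


print(third_letter_becomes_d("mrrjjj"))      # third-letter reading, literal d
print(third_letter_to_successor("mrrjjj"))   # third-letter reading, successor
print(render(length_description("mrrjjj")))  # groups perceived with their lengths
print(render(length_description("iiii")))    # a single group of four
```

Once mrrjjj is described by group lengths, the description of iiii as a single group 4(i) follows from the same code path, which is one simple way to picture the generalisation from one problem to another.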

These two mechanisms do not seem to be specific to the microworld, appearing only in the world of COPYCAT.

Consider the following simple example: a train goes from point A to point B, 60 kilometers away, at a speed of 30 km/h. A fly goes from B to the train and, when it touches the train, goes back to B, then goes to the train again, returns to B, etc., at a speed of 120 km/h. How many kilometers has the fly covered when the train arrives at B? If we perceive the problem as a distance problem, the arithmetical operation needed to find the solution is very difficult, but if we perceive the problem as a time problem, the solution is evident: the train arrives at point B in 2 hours, so the fly has covered 120 x 2 = 240 kilometers. We don’t need a system that would find the right representation, but we must have the possibility of obtaining responses influenced by context and concepts, and we must have an interaction between the process of constructing representations and the manipulation of these representations. Perceiving a thing in a certain way is something we do every day: “this dog is a caretaker”, for example, is not an extraordinary thought that requires a high level of reflection (the question “Why?” demands this, but we have already perceived the dog as a caretaker).
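The contrast between the two perceptions of the train problem can be sketched numerically. The code below is an illustration added for this text, with hypothetical function names: one function solves the problem as a time problem, the other as a distance problem by summing the fly's successive round trips until the gap between train and B is negligible. The second assumes the fly is faster than the train.

```python
def fly_distance_by_time(track_km=60.0, train_kmh=30.0, fly_kmh=120.0):
    """Time reading: the fly flies for as long as the train travels."""
    hours = track_km / train_kmh
    return fly_kmh * hours


def fly_distance_by_legs(track_km=60.0, train_kmh=30.0, fly_kmh=120.0, tol=1e-9):
    """Distance reading: add up every round trip between B and the train."""
    gap = track_km   # distance between the train and B when the fly leaves B
    total = 0.0
    while gap > tol:
        t_meet = gap / (fly_kmh + train_kmh)  # fly and train approach each other
        leg = fly_kmh * t_meet                # distance flown from B to the train
        total += 2 * leg                      # same distance back to B
        gap -= train_kmh * 2 * t_meet         # the train advances during both legs
    return total


print(fly_distance_by_time())
print(round(fly_distance_by_legs(), 6))
```

Both functions agree on 240 kilometers, but the second needs dozens of iterations of a geometric series to get there, which reflects the difficulty the text attributes to the distance reading.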

In the same way, observing a painting makes us think of something else that is not in the painting, but this thought can make us perceive the painting, or a part of it, in another way. This perception-conceptualization loop is the link between perception and cognition, ignored in numerous psychological theories and in a certain artificial intelligence conception of cognitive modeling. For example, the STRUCTURE MAPPING ENGINE for analogical reasoning (Gentner, 1989) separates the knowledge from the mapping “engine”, and introduces a certain format of representations that permits the ENGINE to operate in the desired direction. The problem is raised by Hofstadter (1995): how are representations formed? How is information selected? How is information organized? How can we explain the selection of information that is not constructed before the mapping? The problem will arise as long as perception and cognition are viewed as two independent modules (Forbus et al., in press). Indeed, two “modules” can be studied separately even if the two modules do not exist as such (this can be seen as a large high-level perception effect: one aspect is seen, and then the other one), but the gap thus created between the modules has to be explained².


² This is a general problem in cognitive science: the level problem. See Vivicorsi (accepted) for a study of Fodor’s solution (which has to be rejected) and of Hofstadter’s solution (which has to be considered).

Another very instructive example is the BACON model (Langley et al., 1987), a model of scientific discovery. It is able to discover Kepler’s third law of planetary motion, but it is given only the relevant data used to derive this law (the average distances between the planets and the sun, and their periods). So a selection is made before the system has to derive the law, and it does not take at least 2 years to give the solution as Kepler did³; the students tested do this in one hour, because they are able to find the right solution when given the right information required to find it (Chalmers et al., 1992). Where is the derivation of the law? With such a knowledge apparatus, there is no need for information selection, high-level perception, or interactions between what is perceived and what is conceived.

Finally, when we categorize objects to sort them into, say, three parts, we can place the objects in three different boxes in front of an experimenter; but do we do this in everyday life, when we are not in a laboratory with three boxes to fill? The same question could be asked of all experiments in which the expected solutions are one good and one bad: being obliged to respond within a strict range of solutions is possible and is not in contradiction with mental fluidity, but this kind of experiment cannot show the use of conceptual adaptation.

In conclusion, these mechanisms seem to be involved in all activities requiring conceptual fluidity, and are clearly defined in a microworld (Hofstadter et al., 1995) with more than one alternative representation (as in the simple “train” example). The conceptual slippages (as in the “dog is a caretaker” example) are made on concepts such as letter, group, same, opposite, etc. (see Mitchell, 1993, for all details) and are explicitly implemented by the dynamics of the COPYCAT Slipnet (the program’s concept network). If these mechanisms are required for conceptual fluidity, we must change our conception of “a concept” to permit concepts to be integrated in them (Vivicorsi, 1997). So the question is: are they psychologically plausible in “real” activities like the activities mentioned at the beginning of the paper?

³ 13 years according to Chalmers et al. (1992).

THEIR  PSYCHOLOGICAL 
RELEVANCE:  THE  COPSYCAT 
PROJECT 

The COPSYCAT project (Vivicorsi, in preparation) examines the psychological plausibility of the mechanisms postulated in the COPYCAT model, in order to give a psychological account of the conceptual fluidity appearing in numerous activities. The first experiments (Vivicorsi, 1996b) show that the microworld material used by subjects is a real material on which the mental fluidity of subjects can be exhibited. The material used to produce creative analogy problems permits more than one solution, so it permits us to study which solution is produced by a subject and which perception permits its production. The ongoing experiment presented here shows the reality of high-level perception on this material.

EXPERIMENT 

Forty-five University of Provence undergraduates took part in the experiment. Each of them has to solve 10 problems and, for each problem, to evaluate 12 solutions (Figure 2), presented on a computer. Specifically, the subject is given a problem, gives a solution with no time limit, and then evaluates one by one 12 solutions to the current problem (possibly including his own) with no time limit, clicking on “True” (i.e., it is a possible solution to the problem) or on “False” (i.e., it is not an acceptable solution). The numbers of “True” and “False” propositions vary between 4 and 8 for each problem, giving 50% of each type over all the problems. In sum, 60 True and 60 False propositions are presented to each subject. We suppose that all subjects have the same knowledge of the alphabet (this is a positive aspect of a reduction of the world without the negative aspect of a reduction of the subject’s behavior).

Three factors are manipulated and each subject is in one of eight conditions: the problem can be presented before each evaluation or not (P); three examples can be presented or not (E); problems can be ordered (as in Figure 2) or not (O). All the proposed solutions are randomised for all the subjects. Consequently, the design is S < P2 * E2 * O2 >. There are five subjects in every condition except P-notE-notO (n=6) and notP-E-O (n=9). We will come back later to this problem of subject numbers, but remember that this work is in progress.
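The design described above can be checked with a few lines of code. This is only a bookkeeping sketch written for this text (the variable names are hypothetical): it enumerates the eight conditions obtained by crossing the three two-level factors and verifies that the reported group sizes add up to the 45 subjects.

```python
from itertools import product

# Cross the three two-level factors P, E and O.
conditions = list(product(("P", "notP"), ("E", "notE"), ("O", "notO")))

# Five subjects per condition, except the two conditions named in the text.
subjects = {condition: 5 for condition in conditions}
subjects[("P", "notE", "notO")] = 6
subjects[("notP", "E", "O")] = 9

total = sum(subjects.values())
for condition, n in subjects.items():
    print("-".join(condition), n)
print("conditions:", len(conditions), "subjects:", total)
```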

We record the subject’s solution to each problem, the time for the evaluation of each proposition, the type of evaluation (T/F) compared with the correct evaluation (T/F), the order of appearance of each proposition and problem, and the average response time of the subject.

To organize the data we use Signal Detection Theory (SDT) (Green & Swets, 1974), in which it is possible to analyze in detail the proportions of the four possible cases (Figure 3).

This framework of analysis permits the use of two indices: the discrimination index (d’) and the decision index (b) (Figure 4). According to this model, a subject’s ability to discriminate between true items and false items is given by d’, the distance between the means of the true and false distributions in units of the common standard deviation. The b index measures the subject’s decision criterion, that is: does he prefer to run the risk of missing hits (i.e., to respond F to a T proposition) or to lean towards false alarms (i.e., to respond T to an F proposition)? The case in which b = 1 corresponds to a neutral decision criterion (no bias towards either response).

These two indices, on which means can be calculated without neglecting some of the global variations, are obtained by measuring the proportion of hits (i.e., TT) out of the total of T (60) and the proportion of false alarms (i.e., TF) out of the total of F (60)⁴.
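As a sketch of how the two indices can be obtained from the proportions just described, the code below uses the usual equal-variance Gaussian formulas of Signal Detection Theory: d’ = z(H) - z(F) and b = exp((z(F)² - z(H)²) / 2), where H and F are the hit and false-alarm proportions and z is the inverse of the standard normal distribution function. That the paper's b is this likelihood-ratio index is an assumption, and the counts in the example (54 hits, 16 false alarms) are hypothetical values chosen for illustration, not data from the experiment.

```python
from math import exp
from statistics import NormalDist


def sdt_indices(hits, false_alarms, n_true=60, n_false=60):
    """Return (d', b) from hit and false-alarm counts.

    Assumes the equal-variance Gaussian model. Proportions of exactly
    0 or 1 have no finite z-score, so inv_cdf raises an error for them.
    """
    z = NormalDist().inv_cdf
    z_hit = z(hits / n_true)
    z_fa = z(false_alarms / n_false)
    d_prime = z_hit - z_fa
    b = exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, b


# Chance behavior: as many hits as false alarms.
print(sdt_indices(30, 30))

# Hypothetical subject: 54 hits and 16 false alarms out of 60 each.
d_prime, b = sdt_indices(54, 16)
print(round(d_prime, 3), round(b, 3))
```

A subject with no errors at all (60 hits, 0 false alarms) cannot be scored with these formulas, because the z-scores are infinite; this is why the error-free machine behavior is a limiting case rather than a computed point.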


                  Propositions
                  T          F

Response T        hits       false alarm

Response F        miss       correct reject.

Total             60         60

Figure 3. The adapted SDT stimulus-response matrix.


PROBLEMS

SOLUTIONS “TRUE”

SOLUTIONS “FALSE”

ijk  kij  Ikj  jkl  bio  xwf  kjk 

yk  »  yi,  Imfgop  » ? 

Imfgoplmfgol  Imfgoq  Imfgpq 

Imfgqr  Infhoq 

Imfgqq  ijlgoq  nohiqr  kmfgop 
Imefoplmfgoz 

abc  »abd,  abbccd  »  ? 

abbddd  abbcce  abbcde  abbcef 

aababe  aaaaad  aababx  uububc 
aahahc  abbccd  babcbd  aacacd 

aabc  »  aabd,  ijkk  »  ? 

djkk  jjkk  hjkk  ikkk  ijkd  ijkl  ijdd 
ijii 

iijl  iijd  jkkk  ijik 

abcd»abcde,  mlkji »? 

mlkjie  mlkjij  mlkji  mlkjih 
nmlkji  mlkj  Ikji 

abcdi  mlkjii  fghijmljjk  abed 

abcni»  abcn,  rijk  »? 

mnn  rijn  rijl  rjkl  nijk  sijk 

rst »  rsu,  mrrjjj  »  ? 

mrrklm  mrrppp  mrrjjf  orrjjj 

mrs  »  mrt,  iiii »? 

ijkl  mrsstttt  iiim  mri 

ooe  »  o,riippp  »  ? 

ripipipp  _  _ 

eqe»  qeq,  aaabccc  »? 

qeq  bbbacbbb  baaacccb  abbbc 

cccbaaa  cacbcac  abc  bbbabbb 
qqqeppp  abcbaqaaaecccq  bacb 

Figure 2. The problems and the proposed solutions.


⁴ The three examples proposed in four conditions are not considered, but having some examples before the test can influence the responses and the evaluations (see next section).

HYPOTHESES, RESULTS AND DISCUSSION

We are working on the two mechanisms supposed to be involved in the mental fluidity process.

The  high-level  perception 

The hypothesis is that subjects do not perceive all the possible solutions to a problem, that is, they do not perceive all the relations that permit these responses to be produced. The diagram (Figure 5) shows the position of all subjects (S) within the space containing the machine-behavior (M) and the chance-behavior (C) tables. M represents a subject who makes no error (d’ = ∞, b = 1). C represents a subject who responds with no decision criterion (d’ = 0, b = 1). The two b values are the same because the two distributions in each case are symmetric: in the case of M the distributions are very distant, and in the case of C they are superimposed. S represents the set of the 45 subjects (d’ = 1.876 [s = 1.205], b = 0.542 [s = 0.351]). We show the table of one subject as an illustration (d’ = 1.895, b = 0.532).

These global results show that there is a selection of True propositions that is not a chance selection. Moreover, the possible strategy that consists in giving a response in the resolution phase, and then waiting for this response to be presented in order to recognise it as True, is rarely observed (the pattern would correspond to d’ > 3, b < 0.2). The d’ index is high enough to say that the two distributions are well differentiated. The fact that b < 1 shows that subjects have rather judged the “true-lity” of the propositions.

Figure 4. Illustration of the two SDT indices d’ and b.

We can then conclude that high-level perception, as defined in the previous section, exists on this material. But we must go further to isolate the strategies (if any) of subjects, and to see whether the strategy differs from one condition to another. The results by condition show only that in the condition P-notE-notO there is an inclination to adopt the strategy mentioned above. We therefore need more subjects in each condition, so that calculating means on the results is meaningful.

Another important index can permit us to be more precise about the nature of the propositions judged True. Indeed, some propositions are seen as True, but which of them were produced by the subject before the evaluation? High-level perception predicts that some solutions are judged True, not all the possible solutions. How many solutions among those evaluated as True were perceived in the resolution phase? Our measure is the comparison with the mean response time of the subject. The hypothesis is that the subject takes little time to evaluate a proposition as True if it was his own response to the problem. Over all subjects, 70% of the subjects’ responses judged True are given in a time lower than the mean response time of the subject. We can then select the solutions activated by subjects before the evaluation test (this work is in progress).

THE PERCEPTION-CONCEPTUALIZATION LOOP

The hypothesis is that subjects’ responses or evaluations can influence other responses and evaluations. For this analysis, we must reorganize all the patterns with respect to the order of appearance, and determine the implication of one response for the following one. This work is also in progress; we will use Bayesian Implicative Analysis (Bernard & Charron, 1996) for the data treatment.


The  Copycat  Project;  Towards  A  Conceptual  Fluidity  Theory 


Figure 5. Global results with an example of a chance-behavior (C), an example of a subject-behavior as an illustration of the set of subjects (S) and the no-error machine-behavior (M).


CONCLUSION 

Mental fluidity exists, and we must take it into account in our research, even if a conceptual adaptation does not necessarily have to appear. The central point is that, in a psychological theory or model, a conceptual adaptation could appear. With the COPSYCAT project, we try to evaluate the psychological relevance of two mechanisms proposed by Hofstadter and Mitchell (see Hofstadter et al., 1995) in the COPYCAT project. These mechanisms are seen here as mechanisms that can account for subjects’ behavior when they are confronted with creative problems. The challenge is to determine how general these mechanisms are in a more complex world, without reducing the subject’s behavior.

This type of research has two important consequences. First, we have to (re)define the terms concept and category: indeed, concepts must be fluid to be integrated in the mechanisms. Second, cognitive modeling must be constrained by the perception-cognition loop: the question is not how many concepts are activated, but why these ones are. Access to a theory of conceptual fluidity is difficult, but if we are to grasp our natural tendency to slip from one (micro)world to another (micro)world, we must not ignore a large part of our activities.

Although “reasoning by analogy” is an uncommon term for most people, analogical reasoning emerges in various situations. It is involved in problem solving (Cauzinille-Marmeche, 1990; Holyoak, Junn, & Billman, 1984), in explanation, in scientific discovery, in creative thinking and so on.

Research on analogy has been conducted for twenty years in various domains such as Artificial Intelligence, Neuroscience, and Psychology. These different approaches have gathered a lot of data that is useful for understanding the cognitive processes underlying analogical reasoning.

The aim of this paper is to introduce the research on the analogical mapping process that we began during my Ph.D. First, I briefly outline what analogical reasoning is. Second, the SME and ACME models will be expounded. Finally, I will set out the research itself.

ANALOGICAL  REASONING 

Reasoning by analogy consists in retrieving previous knowledge in order to understand what is unknown or new. Authors agree that reasoning by analogy plays an important role in knowledge acquisition.

This reasoning can also be characterized by its different subprocesses: representation, retrieval, mapping, transfer and induction (Keane, Ledgeway, & Duff, 1994). In order to solve a problem by analogy, one must first represent the new situation (target problem) and then retrieve a useful analogous situation (source or base problem).


A core subprocess in analogy is mapping. Mapping is necessary for finding out whether the target and base situations (or problems) are analogous. This implies that one must construct coherent one-to-one correspondences between the two situations. If the target and base situations are analogous, transferring elements of knowledge from one situation to the other is relevant. A classical example explaining how mapping proceeds is the analogy between the structure of the atom and the structure of the solar system (Gentner & Landers, 1985; see also Gentner, Rattermann, & Forbus, 1993; Holyoak & Koh, 1987). The transfer of a portion of the conceptual structure constitutes the basis of analogical inferences.

According to Ripoll (1992, 1993), and unlike our sequential presentation, these 5 subprocesses would run concurrently.

As mentioned above, analogical reasoning appears in various everyday activities. In addition, Cognitive Psychology has been studying analogical reasoning for about twelve years. Research has been carried out in Developmental Psychology, Cognitive Psychology, and Artificial Intelligence.

In Artificial Intelligence, analogy is useful for two purposes. First, for researchers who aim to understand how the brain functions, analogy is an interesting “mechanism”. Second, analogy may constitute a heuristic tool for improving the performance of expert systems (Savelli, 1993). Certain systems were developed in order to simulate fundamental analogical processes (Gineste, 1997) and in order to investigate how expert systems acquire new knowledge (Cauzinille-Marmeche, Mathieu, & Weil-Barais, 1985).

In Psychology, Piaget proposed a structural stage model of analogical reasoning. Piaget and his colleagues argued that the ability to reason by analogy emerges in early adolescence (Piaget, Montangero, & Billeter, 1977). Accordingly, children would not be able to solve the classical analogy task (a : b :: c : d) before the age of 12, since they could not process abstract relations.

More recently, studies have provided evidence in favor of the notion that analogical reasoning can be used earlier than the formal operational period (Goswami, 1992; Goswami & Brown, 1990; Holyoak, Junn, & Billman, 1984). These authors have shown that when children understand the relations that underlie classical (a : b :: c : d) analogies, they manage to complete four-term analogies successfully.

We  agree  with  this  point  of  view:  we  have- 
carried  out  a  work  about  analogical  problem¬ 
solving  with  young  children  (5  to  6  years) 
which  has  contributed  to  specifying  encod- 
ingcircumstances  thatfacilitate  retrieval  pro¬ 
cess  of  an  analogous  base  problem  (Bastien- 
Toniazzo,  Blaye,  &  Cayol,!997). 

THEORETICAL BACKGROUND

My interest has turned towards "the core" of analogical reasoning: mapping. The view of analogical mapping we support has become integrated into the research performed by Bastien and Bastien-Toniazzo on the context dependence of knowledge.

Bastien argues that knowledge organization is "functional", which means that knowledge is structured with respect to the goals to be reached (Bastien, 1997).

If analogical reasoning is a goal-directed process (Richard, 1990), like understanding, reasoning and judgment, we assume that the analogical mapping process is also goal-directed.

A goal is a feature of the context in which one acts and thinks. Activities like reading, understanding, evaluating and problem solving progress according to the goal representation included in the current situation. Accordingly, we consider that it is possible to associate the concept of "goal" with the concept of "internal context" (i.e., mental context) proposed by Bastien (1997), because the "goal" is included in the representation of new situations.

Mapping models

The first models of analogical mapping attached a lot of importance to relational structure.

Two famous models have aimed to simulate analogical mapping. These models are the Structure Mapping Engine (SME; Gentner, 1983; Falkenhainer, Forbus, & Gentner, 1986, 1989) and the Analogical Constraint Mapping Engine (ACME; Holyoak & Thagard, 1989).

Structure  Mapping  Engine 

SME has been developed to simulate the mapping (M) process: objects (o) from the base (b) knowledge (e.g., the solar system) are placed in correspondence with objects (o) from the target (t) knowledge (e.g., the structure of the atom):

$M: b_o \rightarrow t_o$

The mapping process is assumed to be governed by the "Systematicity principle", which plays a significant part in SME. The Systematicity principle "is a structural expression of our tacit preference for coherence and deductive power in interpreting analogy" (Gentner, 1988, p. 48). SME finds all legal mappings and then combines them to form all possible interpretations of the comparison. The selected interpretation is the one with the best relational structure.
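The idea of keeping the interpretation with the best relational structure can be illustrated with a toy sketch. It is not SME itself: the relation names, the brute-force enumeration of one-to-one object correspondences, and the score (the number of base relations preserved in the target) are all simplifying assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical encodings of the two domains as (relation, arg1, arg2) triples.
BASE = {  # solar system
    ("attracts", "sun", "planet"),
    ("more_massive_than", "sun", "planet"),
    ("revolves_around", "planet", "sun"),
}
TARGET = {  # atom
    ("attracts", "nucleus", "electron"),
    ("more_massive_than", "nucleus", "electron"),
    ("revolves_around", "electron", "nucleus"),
}


def objects(relations):
    """Collect the objects mentioned in a set of relations, in sorted order."""
    return sorted({arg for _, *args in relations for arg in args})


def score(mapping, base, target):
    """Count base relations that still hold in the target after renaming objects."""
    return sum((rel, mapping[a], mapping[b]) in target for rel, a, b in base)


def best_mapping(base, target):
    """Try every one-to-one object correspondence and keep the best-scoring one."""
    base_objs, target_objs = objects(base), objects(target)
    candidates = (dict(zip(base_objs, perm)) for perm in permutations(target_objs))
    return max(candidates, key=lambda m: score(m, base, target))


mapping = best_mapping(BASE, TARGET)
print(mapping, score(mapping, BASE, TARGET))
```

With these assumed encodings, the winning correspondence pairs the sun with the nucleus and the planet with the electron, and it preserves all three relations. The reverse pairing preserves none.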

If the base knowledge (or base domain) and the target knowledge share a relational structure, then significant inferences can be drawn in the base domain and transferred from the base to the target domain. This transfer is also controlled by structural constraints such as the Systematicity principle. Gentner has argued that mapping (the "analogy engine") is not influenced by knowledge. Therefore, SME simulates a mapping process that is independent of domain content, goals and context (Ripoll, 1993). However, this characteristic of SME is not compatible with what is acknowledged about the way human thought is influenced by such factors (Keane, Ledgeway, & Duff, 1994).

Analogical  Constraint  Mapping  Engine 

ACME uses a parallel constraint satisfaction method to construct a single, best interpretation of the comparison. This model is an interactive network. Three constraints are implemented in ACME, namely structural, similarity and pragmatic constraints. In the network, a node represents a match between two predicates. For example, the match between SMART (steve) and ANGRY (fido) involves nodes representing the matches SMART=ANGRY and steve=fido. Nodes are connected by excitatory and inhibitory links which implement the three constraints. The network runs until the activations of the nodes settle into a stable state. The nodes whose activation exceeds a certain threshold are the matches of the best interpretation. Mapping difficulty is measured by the number of cycles the network goes through before reaching the correct mapping.
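The settling process described above can be sketched with a very small network. This is a toy illustration, not ACME's actual update rule or parameters: the second proposition in each situation (TALL (bill), HUNGRY (rover)), the weights, the decay, the clipping of activations to [0, 1], and the external input standing in for the pragmatic constraint are all assumptions.

```python
# Hypothetical situations: base SMART(steve), TALL(bill); target ANGRY(fido), HUNGRY(rover).
EXCITE, INHIBIT, DECAY = 0.2, -0.3, 0.1

# Each node is one match hypothesis (base element, target element).
predicate_nodes = [(b, t) for b in ("SMART", "TALL") for t in ("ANGRY", "HUNGRY")]
object_nodes = [(b, t) for b in ("steve", "bill") for t in ("fido", "rover")]
nodes = predicate_nodes + object_nodes

# Argument of each predicate in the two hypothetical situations.
argument_of = {"SMART": "steve", "TALL": "bill", "ANGRY": "fido", "HUNGRY": "rover"}


def weight(n1, n2):
    """Link weight between two distinct match hypotheses."""
    (b1, t1), (b2, t2) = n1, n2
    # Structural support: a predicate match excites the match of its arguments.
    for pred, obj in ((n1, n2), (n2, n1)):
        if pred in predicate_nodes and obj in object_nodes:
            if (argument_of[pred[0]], argument_of[pred[1]]) == obj:
                return EXCITE
    # One-to-one constraint: rival matches for the same element inhibit each other.
    if b1 == b2 or t1 == t2:
        return INHIBIT
    return 0.0


def settle(external, max_cycles=1000, tolerance=1e-6):
    """Update all activations in parallel until they stop changing."""
    activation = {n: 0.1 for n in nodes}
    for cycle in range(1, max_cycles + 1):
        new = {}
        for n in nodes:
            net = external.get(n, 0.0) + sum(
                weight(n, m) * activation[m] for m in nodes if m != n
            )
            new[n] = min(1.0, max(0.0, (1 - DECAY) * activation[n] + net))
        change = max(abs(new[n] - activation[n]) for n in nodes)
        activation = new
        if change < tolerance:
            return activation, cycle
    return activation, max_cycles


# Assumed pragmatic input: the goal makes the steve=fido match important.
activation, cycles = settle({("steve", "fido"): 0.1})
best = sorted(n for n in nodes if activation[n] > 0.5)
print(best, cycles)
```

Without the external input the two rival interpretations are perfectly symmetric. The small pragmatic bias on steve=fido is enough to make SMART=ANGRY, steve=fido, TALL=HUNGRY and bill=rover win, and the cycle count plays the role of the difficulty measure mentioned in the text.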

This model has drawn our attention because it was one of the first models to take into account and examine the pragmatic constraint. Holyoak and Thagard (1989) contend that the analogical mapping process could be influenced by pragmatic aspects of the base. According to these authors, the term "pragmatic" concerns the elements which people judge to be important for reaching a goal.

The aim of my thesis is to show that analogical mapping is strongly influenced by pragmatic information, namely the goal.

Our assumption is that the mapping process between target (t) knowledge (k) and base (b) knowledge progresses according to the representation of the goal that subjects want to reach. Unlike Holyoak and Thagard (1989), we assume that the role of pragmatic information is played from the target knowledge ($t_k$) and not from the base knowledge ($b_k$):

$M: t_k \rightarrow b_k$


EXPERIMENT 

Our experiment was controlled by computer. HyperCard 2.0© software was used to run this experiment.

Materials

Ten target problems were composed of four terms, of which the fourth one was missing. We varied the kind of terms in order to make the experiment more attractive. The terms displayed were figures (or numbers), letters (or words), geometric shapes, and drawings.


Examples:

Target n° 2: 3 9 27 ?
Target n° 3: Baton Belle Boeuf ?


Every target problem was matched with three base problems, (a), (b), and (c). Base problems were composed of four terms.

Only one relation was included in target problems, whereas two relations were included in base problems. The target relation was either belonging to a category (e.g., odd numbers) or a series of objects or events (e.g., increasing numbers).


Example:

Bases n° 2:
(a) 18 13 83
(b) 12,3 12,1 11,9 11,7
(c) 10,2 li 21,2 32,2

The two relations of base problems were either category and series, or category and same surface, or series and same surface.

The term "surface" refers to object properties which are shared by two situations and which are irrelevant for solving a problem: e.g., colors, shapes and so on. However, numerous empirical findings have shown that surface similarity facilitates the retrieval process (see, e.g., Gentner & Landers, 1985; Holyoak & Koh, 1987; Ripoll, 1998).

Procedure  and  task 

Target problems were first presented alone, one after another, to the participants. Unlike in the classical paradigm, target problems were shown before base problems. The aim was to test our assumption that analogical mapping would be governed by the goal representation of the target situation. Targets were displayed for two seconds and then removed so that they could not be solved immediately. This time limit nevertheless allowed subjects to encode the terms of the target problems.

Each target problem was then presented once more with base problems (a), (b) and (c) in random order. For each of the three target-base pairs, participants were asked to assess whether the base was a support for solving the target problem. Participants clicked with the cursor on a yes or no button; this was the mapping task. Mapping times were recorded by the computer. After mapping target and bases, subjects gave answers to solve the target problems. Verbal answers were typed and recorded.


          a      b      c     Sum
yes      82     80     60     222
no      118    120    140     378

Chi-squared  (2)  =  6.349,  p  <  .0418. 
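The statistic above can be checked directly from the yes/no counts for bases (a), (b) and (c). The sketch below assumes a standard Pearson chi-squared test of independence on the 2 x 3 table; the source does not name the exact procedure that was used.

```python
import math

# Observed counts: rows are responses, columns are bases (a), (b), (c).
observed = {"yes": [82, 80, 60], "no": [118, 120, 140]}

row_totals = {resp: sum(counts) for resp, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Pearson chi-squared: sum over cells of (observed - expected)^2 / expected.
chi2 = 0.0
for resp, counts in observed.items():
    for col, obs in enumerate(counts):
        expected = row_totals[resp] * col_totals[col] / grand_total
        chi2 += (obs - expected) ** 2 / expected

dof = (len(observed) - 1) * (len(col_totals) - 1)
# For exactly 2 degrees of freedom the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2({dof}) = {chi2:.3f}, p = {p_value:.4f}")  # chi2(2) = 6.349, p = 0.0418
```

Under that assumption the computation reproduces the reported value, with an expected count of 74 yes and 126 no responses for each base.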


Mapping  times 

We predicted that mapping should take some time. Analogical mapping is a conscious process (Ripoll, 1992) that develops simultaneously between the target situation and the base situation. Therefore, this process has a high time cost.

We expected participants to spend more time concluding that a base problem could be a support for solving the target problem than concluding that a base problem is not relevant. This prediction is associated with the mapping pattern predictions.

Verbal  answers 

After the mapping task, subjects were asked to suggest answers to solve the target problems. We predicted that verbal answers should be consistent with mapping patterns: if subjects click on the yes button, then their verbal answers should match the fourth term of the base problem.

First  observations 

At this time, only 20 students of the University of Provence have voluntarily taken part in the experiment.

Mapping patterns

A descriptive analysis shows that subjects tended to answer in the negative for bases (a) and (b).

Table 1. Distribution of responses (yes/no), according to bases (a, b, c).

EXPECTATIONS 
Mapping  patterns 

We expected participants to assess (a) and (b) bases to be more relevant than (c) bases. Accordingly, yes (y) responses would be linked with (a) and (b) bases and no (n) responses with (c):

a     b     c
yes   yes   no

Figure 1. Mean mapping times taken by each subject (1 to 20) to assess whether bases are, or are not, a support for completing target problems.

Figure 2. Mean mapping times to click on the yes button (relevant base for solving target problems) or the no button (irrelevant base).

This was observed whatever the kind of target-base pair (figures, letters, etc.).

These results differ from our expectations, according to which only base problems (c) would be mainly rejected as a support for completing the target problem.

There were eight possible patterns. The expected pattern (yyn) is not the most frequent: it represents 14% of the responses, whereas the nnn pattern represents 29.5%.


yyy   yyn   yny   ynn   nyy   nyn   nny   nnn
 17    28    17    28     3    32    16    59
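The eight patterns and the two percentages quoted in the text can be related to raw counts with a short sketch. It assumes that every one of the 20 participants produced one yes/no triple for each of the 10 targets, which gives 200 triples in total. The variable names are illustrative.

```python
from itertools import product

# 20 participants each judged 10 targets, so there are 200 response triples.
n_triples = 20 * 10

# Every pattern is one yes/no answer for each of the bases (a), (b), (c).
patterns = ["".join(p) for p in product("yn", repeat=3)]
print(patterns)  # 8 patterns, from 'yyy' to 'nnn'

# Percentages quoted in the text, converted back to counts.
share = {"yyn": 0.14, "nnn": 0.295}
counts = {pattern: round(fraction * n_triples) for pattern, fraction in share.items()}
print(counts)  # {'yyn': 28, 'nnn': 59}
```

The expected yyn pattern therefore corresponds to 28 of the 200 triples and the nnn pattern to 59.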

There are a lot of negative answers. We therefore have to consider the difficulty of the mapping task. In addition, a few participants said that they had difficulty understanding drawings like base (a) n° 10, which showed balls, or like the cable-car of base (b) n° 9.

But the subjects' behaviour could also be implicated in this difficulty. We noticed that subjects were looking for overly complex relations whereas the relations contained in the materials were simple: increasing & decreasing; fast & slow; quadrilateral & ellipse, for example. Moreover, a few subjects said that they had dismissed the simplest relations because they thought that the materials were designed with complex relations.

Mapping  times 

Mapping times show that the mapping process takes a long time. The mean elapsed time was 13.9 seconds. In addition, there was a large variability between subjects. For example, subject n° 4 took 4.46 seconds to click on the yes or no button and subject n° 18 took 33.23 seconds (see Figure 1).

Table 2. Distribution of pattern responses.

Table 3. Distribution of verbal answers with regard to the yyn pattern and with respect to the categories of responses.

                        a      b      c     target
                       yes    yes    no
coherent answers        15     18      0       0
surface similarity       2      2      0       0
other relations          0      3      0       5

As expected, subjects spent more time assessing that a base problem could be a support for solving the target problem than concluding that a base problem is not relevant (see Figure 2).

However, the overall difference between yes and no responses is not significant. At present, we are analysing more precisely the mapping times of subjects whose pattern was yes, yes and no.

Verbal  answers 

Verbal answers given by participants were grouped into three categories: coherent answers, answers based on surface similarity, and other relations. When we relate these categories to the eight patterns, we notice that, in general, subjects proposed answers which were coherent with their patterns. When they thought a base was useful for solving a target problem, they gave an answer which was coherent with the fourth term of the base problem. This result can also be observed with the yyn pattern, despite the difficulty of the mapping task and the time spent matching target and base problems.


CONCLUSION 

Intellectual honesty obliges us to be careful. First, the number of participants is insufficient and we have to change a few drawings. Second, our results require more precise statistical analyses. However, the first analyses of verbal answers would seem to indicate that the analogical mapping process could be influenced by the target problems, which were shown before the base problems.

Another question concerns the lack of spontaneity of subjects. People are too focused on finding one solution. This experiment allows participants to be free in their answers. They are not instructed to be fast and they are asked to suggest as many responses as possible to solve the target problems. It is important that people feel free because, from the answers proposed, we are in a position to study how subjects perceive the goal of the situation in which they are involved.

The distributed neural network that subserves analogical reasoning was identified using $^{15}$O PET on 12 normal, high-intelligence adults. Each trial presented during scanning consisted of a source picture of colored geometric shapes, a brief delay, and a target picture of colored geometric shapes. Analogous pictures did not share similar geometric shapes but did share the same system of abstract relations. Subjects judged whether each source-target pairing was analogous (analogy condition) or identical (literal condition). The results of the analogy-literal comparison showed left hemisphere activation in the inferior, middle, and medial frontal cortex, the inferior parietal cortex, and the superior occipital cortex. Based on converging evidence from neuropsychological and neuroimaging studies, we hypothesize that the inferior frontal and the inferior parietal cortices mediate analogical mapping.

THE NEUROANATOMY OF ANALOGICAL REASONING

Analogical mapping is important to understand because it is a cognitive ability necessary for explanation, learning, and categorization within virtually all forms of discourse. Although a considerable amount is known about its psychological aspects, extremely little is known about the neuroanatomical basis of analogical reasoning. To date, there have been no neuroimaging investigations of analogy with positron emission tomography (PET) or functional magnetic resonance imaging (fMRI), nor any focal lesion studies. However, a hypothesis about the neuroanatomical basis of analogical mapping can be made on the basis of neuropsychological studies of other forms of structure-driven reasoning (e.g., deduction) and $^{133}$Xe imaging experiments which have used analogical materials. On this basis we hypothesize that analogical mapping should be mediated by a distributed network based in the left prefrontal cortex and the left inferior parietal cortex. We report the results of a PET study that supports this hypothesis.

Reasoning  and  the  brain 

Because analogy theoretically shares many of the same representations and processes as logic and deduction (Halford, 1992), we can use neurological theories of deduction as a partial basis for neurological theories of analogical mapping. As reviewed in Wharton and Grafman (1998), an important distinction among cognitive theories of deduction is whether or not they focus on the influence of socially relevant content. Content refers to statements that imply a causation or social regulation (e.g., If one is to drink alcohol, one must be over eighteen). In contrast, a content-independent statement implies relatively little relevant information (e.g., If there is an A on one side of a card, then there is a 4 on the other side).

Clinical and neuroimaging studies appear to show that the left hemisphere conducts reasoning on the basis of formal logical operations whereas the right hemisphere and the medial ventral frontal cortex reason on the basis of experience. In Golding (1981), subjects were neurological patients with either no cerebral brain lesions, right hemisphere brain lesions, or left hemisphere brain lesions. These subjects were tested with a version of the Wason (1966) selection task. Subjects were shown cards that each had half of the top side masked. The unmasked side of each card showed either a circle, a diamond, a yellow patch, or a green patch. The task was to name the cards that would need to be unmasked to discover the truth of the rule, "whenever there is a circle on one half of the card there is yellow on the other half of the card." The rule would be falsified if the other side of the circle card showed green or if the other side of the green card showed a circle. Whereas only one left hemisphere lesioned patient and no control patients picked the circle and green cards, ten of the twenty right hemisphere lesioned patients surprisingly did better and picked these two cards. This finding points to the crucial role of the left hemisphere in deductive reasoning.
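The logic of this version of the selection task can be made explicit with a short sketch. The encoding of cards as a visible half plus a hidden half, and the function names, are illustrative assumptions; the rule and the four visible halves come from the description above.

```python
# Visible halves of the four cards in the task described above.
CARDS = ["circle", "diamond", "yellow", "green"]
SHAPES = {"circle", "diamond"}
COLORS = {"yellow", "green"}


def violates_rule(shape, color):
    """The rule 'circle implies yellow' fails only for a circle paired with non-yellow."""
    return shape == "circle" and color != "yellow"


def must_unmask(visible):
    """A card needs unmasking if some possible hidden half could falsify the rule."""
    if visible in SHAPES:
        return any(violates_rule(visible, color) for color in COLORS)
    return any(violates_rule(shape, visible) for shape in SHAPES)


print([card for card in CARDS if must_unmask(card)])  # ['circle', 'green']
```

Only the circle and green cards can hide a counterexample, which is why picking exactly these two counts as the logically correct response.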

Additional evidence for the primary role of the left hemisphere in logic and deduction is provided by studies showing the difficulty that aphasics (especially with left posterior lesions) have in understanding even the simplest logic statements. Importantly, these studies indicate that right hemisphere lesioned subjects do not show general logical reasoning difficulties (Wharton & Grafman, 1998).

Ideally, in analogical reasoning, the objects and actions being mapped are much less significant than the structural relationships between these objects and actions (e.g., Gentner's (1983) "systematicity"; Holyoak & Thagard's (1989) "isomorphism"). For example, in the Bohr planetary analogy of the atom, electrons are mapped to planets, not because of any physical or conceptual similarity, but because both revolve around a central body. Thus, it is likely that analogical reasoning, unless concerning topics with relevant content, is also dependent upon the left hemisphere.

Analogy  and  the  brain 

Although there has been little research into the neural basis of analogical reasoning, a number of studies have used analogical materials as a means of inducing verbal cognitive processing in subjects (Gur et al., 1994; Risberg, 1975). In these studies, a $^{133}$Xe inhalation technique was used to assess subjects' regional cerebral blood flow (rCBF) while they rested or solved four-term verbal analogies (e.g., kite is to air as raft is to a) fish, b) swimmer, c) duck, or d) water). These studies' hypotheses did not address analogical reasoning per se. Accordingly, designs were used that did not subtract out rCBF from cognitive activity not specific to analogical mapping (e.g., reading). Compared to a resting baseline, subjects solving analogy problems generally show more activation in the left than in the right hemisphere, particularly the posterior temporal and parietal cortices. Gur et al. (1994) noted that analogical reasoning performance was significantly correlated with rCBF detected around the left inferior parietal cortex and so speculated that the left angular gyrus may be especially central to analogical reasoning. The left inferior parietal cortex has also been shown to be important to computational processes related to analogy such as arithmetic processing (Ardilla, 1993) and reasoning with spatial propositions (Hier et al., 1980). Thus, it is likely that the left inferior parietal cortex is an important part of the distributed neural network in the brain that mediates rule-based cognitive processes.

$^{133}$Xe studies using analogical materials have not shown significant activation in the left prefrontal cortex. However, various researchers have speculated that the dorsolateral prefrontal cortex (DLPFC) is specialized for mapping arguments to complex mental representations (Grafman, 1995; Holyoak & Kroger, 1995; Robin & Holyoak, 1995). Analogical mapping may be an emergent special case of this general property of the DLPFC. Also, several studies have reported that Broca's aphasics are impaired in logic and deduction (Wharton & Grafman, 1998) and a PET study of deduction reported that subjects solving deduction problems showed left prefrontal activation (Goel et al., 1997). Finally, given the amount of evidence in support of the view that regions in the left prefrontal cortex are responsible for syntactic language processing (Caplan, Hildebrandt, & Makris, 1996) and the fact that analogical mapping strongly resembles a syntactic process, it is likely that the left prefrontal cortex, as well as the left inferior parietal cortex, mediates analogy.

Method  overview  and  hypothesis  predictions 

We used PET with $^{15}$O-labeled water to measure the rCBF of subjects performing an analogical match-to-sample task and a literal match-to-sample task. The literal task served as a comparison condition for the analogy task.

Visual objects were used as stimuli so that a large number of novel analogies could be created. (Although not explored as extensively as verbal analogical reasoning, visual analogical reasoning has been studied both with behavioral experiments (Gick, 1985; Goswami, Brown, Mulholland, Pellegrino, & Glaser, 1980) and with computational modeling (Goldstone, 1994; Thagard, Gochfeld, & Hardy, 1992)). Stimuli consisted of groups of three to five colored, geometric shapes such as circles and stars that were framed by a larger geometric shape such as a square, circle, rectangle, diamond, or triangle (see Figures 1 and 2). All objects within these frames could be easily labeled verbally (e.g., "rectangle").


As shown in Figure 1, individual trials consisted of the sequential presentation of a source picture (3 s display), a fixation cross (intratrial delay), a target picture (3 s display), and then another fixation cross (intertrial delay). In the analogy conditions, subjects indicated whether the target picture was an analog of the source picture. In each correct trial, the source and target pictures contained different objects but shared the same system of relations. In each incorrect trial, one object in the target was mismatched to its corresponding object in terms of its spatial relationship (i.e., position) or object relationship (i.e., shape, texture, or color) (see upper right two panels of Figure 2). In the literal conditions, subjects indicated whether the target picture was an exact match of the source picture. In each correct trial, the source and target pictures were identical, whereas in each incorrect trial, one object in the target was a different object or was spatially displaced (see bottom right two panels of Figure 2).

Figure 1. Example of stimuli for a correct trial sequence in the analogy condition.

Figure 2. Correct and incorrect trials for the analogy condition (top row) and the literal condition (bottom row).

We used a 2 (Similarity: analogical, literal) x 2 (Intratrial Delay: immediate, delay) design that produced four conditions. In the delay analogy and delay literal conditions, the intratrial and intertrial delays were 3000 ms and 500 ms, respectively. In the immediate analogy and immediate literal conditions, the intratrial and intertrial delays were 100 ms and 3400 ms, respectively. The delay and immediate conditions were designed so that, when compared to each other, they would show the rCBF activation specific to holding the mental representations of the source pictures in working memory. The analogy and literal conditions were designed so that, when compared to each other, they would show the rCBF activation of brain regions engaged in analogical mapping.

Given that our materials require subjects to perceive spatial-object analogies, it is relevant to note Hier et al.'s (1980) examination of three semantic aphasics, two with infarctions of the left parieto-occipital junction and one with a bilateral hemorrhage of the parieto-temporo-occipital junction. Whereas these patients could use abstract words such as crystallized, saccharin, immature, and ^/ecw/ve, they could not correctly follow commands using spatial prepositions such as beside, under, behind, before, or away from, nor comprehend simple logico-grammatical relationships. Hier et al. concluded that the left temporo-parieto-occipital region subserves perception of spatial relationships (see Farah, 1995; D'Esposito et al., 1997). Thus, we predicted that the analogy-literal comparison would reveal activation in the left inferior parietal cortex, adjacent areas in the left occipital cortex, and the left prefrontal cortex.
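The four conditions of the 2 x 2 design and their timing can be laid out in a short sketch. The durations are the ones stated in the text (3 s pictures plus the two fixation-cross delays); the data layout and names are illustrative assumptions.

```python
from itertools import product

PICTURE_MS = 3000  # source and target pictures are each shown for 3 s

# Fixation-cross delays (ms) for the two levels of the Intratrial Delay factor.
DELAYS = {
    "delay": {"intratrial": 3000, "intertrial": 500},
    "immediate": {"intratrial": 100, "intertrial": 3400},
}


def trial_duration(delay_level):
    """Source picture + intratrial delay + target picture + intertrial delay."""
    d = DELAYS[delay_level]
    return PICTURE_MS + d["intratrial"] + PICTURE_MS + d["intertrial"]


conditions = list(product(["analogy", "literal"], DELAYS))
for similarity, delay_level in conditions:
    print(similarity, delay_level, trial_duration(delay_level))
```

Adding up the stated durations gives 9500 ms per trial at both delay levels. The text does not say so explicitly, but the arithmetic suggests that the intertrial delay was chosen to keep total trial length constant across conditions.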

METHOD 

Subjects 

Subjects  were  6  females  and  6  males,  all 
right-handed  (mean  age  and  years  of  education, 
26  years  and  18  years,  respectively).  Subjects’ 
mean  scaled  scores  on  both  the  WAIS  (Weschler, 
1991)  vocabulary  and  block  design  subscales 
were  above  average  (13  and  12,  respectively). 

Materials  and  Apparatus 

All source pictures appeared, across subjects, in analogy and literal conditions (see left column of Figure 2).

For our stimuli, spatial relations refers to categorical predicates describing the relative spatial positions of all objects in a picture (e.g., diagonal_to (blue (oval1), blue (oval3))). Object relations refers to categorical predicates describing the relative shape, color, size, and texture of objects to each other (e.g., three_of_a_kind (blue (oval1), brown (oval2), blue (oval3))). The system of object and spatial relations refers to the combinations of predicates required to fully describe each picture (e.g., three_of_a_kind (diagonal_to (blue (oval1), blue (oval3)), above (brown (oval2), blue (oval3)), etc.)). We assume that object and spatial predicates can be represented in verbal, visual, or both modalities.
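The notion of two pictures sharing the same system of relations while containing different objects can be sketched as follows. The three pictures, the binary same_color relation, and the brute-force search over object renamings are illustrative assumptions; they are not the stimuli or the procedure used in the study.

```python
from itertools import permutations

# Hypothetical source picture, encoded as (relation, arg1, arg2) triples.
source = {
    ("diagonal_to", "oval1", "oval3"),
    ("above", "oval2", "oval3"),
    ("same_color", "oval1", "oval3"),
}
# Hypothetical correct target: different shapes, same arrangement of relations.
target = {
    ("diagonal_to", "star1", "star3"),
    ("above", "star2", "star3"),
    ("same_color", "star1", "star3"),
}
# Hypothetical incorrect target: one object is spatially mismatched.
mismatch = {
    ("diagonal_to", "star1", "star3"),
    ("above", "star3", "star2"),
    ("same_color", "star1", "star3"),
}


def objects(picture):
    """Collect the objects mentioned in a picture's relations, in sorted order."""
    return sorted({arg for _, *args in picture for arg in args})


def is_analog(src, tgt):
    """True if some one-to-one renaming of objects turns src's relations into tgt's."""
    src_objs, tgt_objs = objects(src), objects(tgt)
    if len(src_objs) != len(tgt_objs):
        return False
    for perm in permutations(tgt_objs):
        rename = dict(zip(src_objs, perm))
        if {(rel, rename[a], rename[b]) for rel, a, b in src} == tgt:
            return True
    return False


print(is_analog(source, target), is_analog(source, mismatch))  # True False
```

The correct target passes because renaming each oval to the corresponding star reproduces every relation, whereas the spatially mismatched target fails for every possible renaming.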

The following factors influenced the design of stimuli for incorrect trials:

1. We wanted subjects to map each picture's system of object and spatial relations. For 50% of incorrect analogy trials, one object in the target picture was spatially mismatched to an object in the source picture that it correctly matched for color, shape, and size relations (see middle oval in the upper middle right panel of Figure 2). In the other incorrect trials, one object in the target picture was mismatched in terms of object relations to one object in the source picture that it spatially matched (see triangle in the upper far right panel of Figure 2).

2. Literal trials were designed to subtract activation from the analogy trials in statistical analysis. Accordingly, except for analogical reasoning, we wanted to minimize the differences in the cognitive processes that were used in performance of analogy and literal trials. Incorrect literal trials were similar to incorrect analogy trials in that one object in the target picture was either mismatched in terms of its previous spatial position or object characteristics (see lower right two panels of Figure 2). A side effect of this manipulation is that incorrect literal trials were likely easier to detect than incorrect analogy trials. An alternative way of constructing incorrect literal trials would have been to make them equivalent in difficulty to incorrect analogy trials by making relatively subtle object and spatial changes between incorrect literal base and target images. However, such a materials manipulation would possibly require subjects to use qualitatively different encoding and comparison processes in the literal condition as compared to what they would use in the analogy condition.

PET scans were performed using a Scanditronix
PC2048-15B (Uppsala, Sweden), which collected 15
contiguous planes with a voxel resolution of
2 mm x 2 mm x 6.5 mm.

Procedure 

After 40 min of pretraining, each subject was
scanned twice in each condition. All presented
pictures were seen only once, and an equal number
of false and true trials were presented in each
scan. Presentation order of the four conditions was
counterbalanced across subjects, and all source-target
pairings were seen equally in delay and immediate
conditions. To control for neural activation from eye
movement, each picture was displayed separately to
subjects (see Fig. 1).

Each subject's head was secured with a
conforming plastic mask and positioned for
scans from 14 mm to 111.5 mm above the
canthomeatal line. A transmission scan was
obtained with a rotating 68Ge/68Ga source. Each
scan resulted from an intravenous bolus of 37
mCi for a 60 sec period beginning 13-16 s
after bolus.


Figure 3. Brain regions activated in the analogy-literal comparison.

RESULTS

Behavioral  measures 

Mean differences were tested with a two-way
within-subjects ANOVA. As compared to their
performance during scanning in the literal
condition, subjects' performance during scanning
in the analogy condition was slower (1415 vs.
984 ms; F(1, 11) = 128.49, p < .0001) and less
accurate (87% vs. 97%; F(1, 11) = 127.08,
p < .0001). Subjects' accuracy rates ranged between
.81 and .94 in the analogy condition and between
.91 and 1.00 in the literal condition. For accuracy
rates, the main effect of delay and the interaction
of similarity by delay were not significant
(both F < 1).

Functional  measures 

Scans were realigned to correct for head
movement, then normalized to the Talairach
and Tournoux anatomic space (Talairach &
Tournoux, 1988). Smoothing was done with
a 20 mm x 20 mm x 12 mm Gaussian filter to
reduce mismatch due to anatomic variation.
Subject-specific ANCOVA was used to discount
variations in overall intensity between scans.
Within-group comparisons of rCBF were produced
by statistical parametric mapping (SPM95;
Wellcome Department of Cognitive Neurology,
London, UK; Frackowiak & Friston, 1994) with
tests of significance for the size of the
activated region (Friston, Worsley, Frackowiak,
Mazziotta, & Evans, 1993-1994). Regions of
interest (ROIs) were defined by a threshold of
Z = 3.09 for each contrast between conditions.
For each ROI, two statistical probabilities were
obtained (α = .05): a p value for whether the
ROI's peak intensity difference was significant,
and a p value for the ROI's spatial extent,
representing the probability that the clustered
voxels comprising the ROI arose by chance.
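
To give a sense of how strict the Z = 3.09 threshold is, the sketch below
converts a Z score into an upper-tail probability. It assumes the statistic
follows a standard normal distribution and that the tail of interest is
one-sided; neither assumption is stated in the text, and the function name
one_tailed_p is hypothetical.

```python
import math


def one_tailed_p(z: float) -> float:
    """Probability that a standard normal variate exceeds z."""
    # erfc is the complementary error function; this is the usual
    # closed form for the upper tail of the standard normal.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p = one_tailed_p(3.09)
print(f"Z = 3.09 corresponds to a one-tailed p of about {p:.4f}")
```

Under those assumptions the threshold corresponds to an uncorrected
voxel-level probability of roughly one in a thousand.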

Figure 3 displays a three-axis SPM plot
of the analogy-literal comparison (left
hemisphere is left on transverse and coronal views;
frontal areas are to right on sagittal and
transverse views). As shown in Figure 3, the
analogy-literal comparison indicated significant
rCBF activation in the medial frontal cortex
and in left hemisphere regions including the
DLPFC and a parietal-occipital area. Specific
locations of local maxima are shown in
Table 1. The DLPFC region had a local maximum
in the middle frontal gyrus (BA 6) as well
as other significant maxima in the inferior
frontal gyrus (BA 10, 44, 45, 46). The medial
frontal cortex region had a local maximum
in the superior frontal gyrus (BA 8). The
parietal-occipital region had a local maximum in
the inferior parietal lobule (BA 40) as well
as other significant maxima in the inferior
parietal lobule (BA 7, 40), and the superior
occipital region (BA 19).

Neither  the  main  effect  of  delay,  nor  the 
interaction  of  delay  and  analogy  revealed  sig¬ 
nificant  activation. 

DISCUSSION 

The  results  of  the  PET  scans  indicate  that 
relative  to  when  performing  literal  matching, 
subjects  performing  analogical  matching  utilize 
a  network  consisting  of  the  left  inferior  and  mid¬ 
dle  prefrontal  cortices,  the  medial  frontal  cor¬ 
tex,  the  left  inferior  parietal  lobule,  and  the  left 
superior  occipital  cortex.  These  results  are  note¬ 
worthy  because  they  are  the  first  to  have  come 
from  an  imaging  study  specifically  designed  to 
localize  analogical  reasoning.  Additionally,  our 
results  add  converging  evidence  to  the  idea  that 
content-independent  reasoning  is  mediated  by 
the  left  hemisphere  (Wharton  &  Grafman, 
1998). Finally, these results support the
theorized role of the frontal cortex in reasoning
(Grafman, 1995; Holyoak & Kroger, 1995).

There are two alternative explanations for
our results. First, subjects may have been
looking only for simple spatial “popout” in
the analogy condition. However, if one
judged that a match had occurred unless one
detected a spatial mismatch, the maximum
obtainable correct rate would be (1 * .25
[spatial mismatches] + 0 * .25 [object
mismatches] + 1 * .50 [correct matches]) = .75.
The lowest accuracy rate of any subject in the
analogy condition was .81, which exceeds this
ceiling. Further, given the fact that base
pictures were complex and novel, as well as
perceptually different from their targets,
looking for popout with both object and spatial
relations would require full analogical mapping
anyway. Second, because subjects' activation was
almost solely in the left cerebral cortex, this
activation may have been due entirely to
phonological working memory (Baddeley, 1992).
However, experiments have demonstrated
that subjects' performance in verbal deduction
problems is not significantly affected by
verbal rehearsal (Hitch & Baddeley, 1976;
Gilhooly, Logie, Wetherick, & Wynn, 1993;
Toms, Morris, & Ward, 1993).
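
The .75 ceiling for a spatial-popout-only strategy can be checked with a
short calculation. The sketch below uses the trial proportions and the
lowest observed accuracy given in the text; the function and variable names
are hypothetical and serve only to make the arithmetic explicit.

```python
# Proportions of analogy trials by type, as stated in the text.
trial_mix = {
    "spatial_mismatch": 0.25,
    "object_mismatch": 0.25,
    "correct_match": 0.50,
}


def correct_response(trial_type: str) -> str:
    """The response that would be scored as correct for a trial type."""
    return "match" if trial_type == "correct_match" else "mismatch"


def popout_only_response(trial_type: str) -> str:
    """Respond 'mismatch' only when a spatial mismatch is detected."""
    return "mismatch" if trial_type == "spatial_mismatch" else "match"


def expected_accuracy(mix, strategy) -> float:
    """Sum the proportions of trials on which the strategy is correct."""
    return sum(
        proportion
        for trial_type, proportion in mix.items()
        if strategy(trial_type) == correct_response(trial_type)
    )


ceiling = expected_accuracy(trial_mix, popout_only_response)
lowest_observed = 0.81  # lowest subject accuracy in the analogy condition

print(f"popout-only ceiling: {ceiling:.2f}")
print(f"lowest observed accuracy exceeds ceiling: {lowest_observed > ceiling}")
```

The strategy is always wrong on object mismatches, which is why it cannot
score above .75 however well spatial mismatches are detected.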

A consequence of constructing incorrect
literal trials similar to analogy trials is that
subjects were more accurate in the literal
condition than in the analogy condition. Although
we believe that the significant activation
differences of the analogy-literal comparison are
the result of analogical processing, some of the
activation differences may also reflect the
additional attention needed to perform in the
analogy condition.

Besides  activation  in  the  left  anterior  and 
posterior  regions,  the  analogy-literal  compar¬ 
isons  also  revealed  activation  in  the  dorsal  me¬ 
dial  frontal  cortex.  Research  with  monkeys  has 
shown  that  this  area  is  involved  with  spatial 
attention  processes  (Lee  &  Tehovnik,  1995). 
Thus,  dorsal  medial  frontal  activation  may 
have  been  due  to  extra  spatial  processing  re¬ 
quired  in  the  analogy  condition,  spatial  and 
object  analogical  mapping,  or  both. 

The  working  memory  comparison  (i.e., 
delay  -  immediate)  may  not  have  shown  sig¬ 
nificant  activation  because  the  process  of 
holding  stimuli  in  working  memory  for  3  s 
was  not  inherently  a  demanding  enough  task 
to produce significant activation. Alternatively,
given subjects' extensive pretraining, subjects
may have become so practiced at keeping
mental representations of the stimuli in
mind that associated brain activations fell
below detectable levels.


CONCLUSION 

Our results support the hypothesis that the
left prefrontal and inferior parietal cortices are
especially central to analogical mapping. Our
findings are especially important because they
are the first to localize the crucial cognitive
processes required for analogical mapping to
specific brain regions and because they demonstrate
that analogical mapping is a tractable topic for
neuroimaging investigation. Our results should
provide encouragement for more focused
neuroanatomical studies of analogical reasoning.

ABSTRACT

Language-training,  or  prior  experience  with 
arbitrary  symbols  for  the  abstract  concepts  “same 
and  different”,  appears  to  be  necessary  before 
chimpanzee  or  child  can  judge  different  pairs  of 
objects  or  patterns  to  be  analogically  the  same. 
Comparable  training  with  symbols  for  “same  and 
different”,  however,  does  not  enable  macaque 
monkeys  to  judge  the  analogical  equivalence  of 
stimulus  pairs.  Why  should  this  be?  There  is, 
after  all,  good  evidence  that  monkeys  and  pi¬ 
geons  can  judge  whether  objects  or  events  are 
the  same  on  the  basis  of  physical  identity  or 
membership in a common class or category. Unlike
the chimpanzees and children, however, neither
adult nor infant macaque monkeys spontaneously
perceive the analogical identity of
relations-between-relations. These results support
the hypothesis that representational re-coding of
abstract relations via symbols enables child and
chimpanzee to explicitly express that which they,
if not monkeys, perceive implicitly early in life.

Analogical judgments of similarity are a
hallmark of human reasoning and intelligence
(Spearman, 1923; Sternberg, 1977). Similarity
judgments can be based solely on physical
identity or the degree of resemblance between
categorical attributes. Analogies, however, entail
judgments about the equivalence of higher-order
relational structures and representations that
need not physically resemble one another
(Gentner & Markman, 1997; Goswami, 1991;
Holyoak & Thagard, 1997).


Recent  research  indicates  that  early  in  life 
humans  and  chimpanzees  have  perceptual  and 
cognitive  precursors  for  the  development  of 
higher  level  analogical  information  process¬ 
ing  abilities  that  are  not  shared  by  adult  or  in¬ 
fant  macaque  monkeys  (Thompson,  1995;  Th¬ 
ompson  &  Oden,  1996).  Furthermore,  some 
form  of  re-coding  via  language  or  analogous 
symbolic  systems  catalyses  the  explicit  expres¬ 
sion  of  these  implicit  competencies  in  prob¬ 
lem  solving  tasks  involving  analogical  reason¬ 
ing  by  both  natural  and  artificial  learning  sys¬ 
tems  (Thompson,  Oden,  &  Boysen,  1997; 
Clark  &  Thornton,  1997). 

IMPLICIT AND EXPLICIT KNOWLEDGE
ABOUT ANALOGICAL RELATIONS IN
CHIMPANZEES AND CHILDREN.

Language-naive chimpanzees and pre-linguistic
human infants perceive relations (identity
or nonidentity) to be the same or different
as measured by either visual gaze or object
handling in preference-for-novelty tasks like
‘paired-comparison’ and ‘habituation/dishabituation’.
However, both the non-linguistic and the
pre-linguistic species fail to explicitly judge the
analogical equivalence of one identity relation (AA)
with another identity relation (BB), and one
nonidentity relation (CD) with another (EF)
(Oden, Thompson, & Premack, 1990; Tyrrell,
Stauffer & Snowman, 1991; Tyrrell, Zingaro,
& Minard, 1993). Note that in this example,
and for the remainder of this paper, letters (e.g.,
AA & CD) are used only for expository purposes
in lieu of the actual or digitized stimulus
objects employed.

Only those humans and chimpanzees exposed
to a regime of language or symbolic token
training can judge abstract
relations-between-relations as being the same or
different (House, Brown & Scott, 1974; Premack,
1978, 1983a, 1983b; Thompson, Oden & Boysen,
1997). For example, this capacity is revealed
in conceptual matching-to-sample tasks. In this
problem a chimpanzee or child is correct if they
match a pair of shoes with a pair of apples,
rather than with a paired eraser and padlock.
Likewise, they are correct if they match the latter
nonidentical pair with a paired cup and
paperweight. The conceptual matching-to-sample
task can be conceived of as a nonlinguistic
analogy problem involving a single abstract
relationship of same or different. Gillan, Premack
& Woodruff (1981) demonstrated that a
language-trained chimpanzee, Sarah, who
matched conceptually also succeeded in
completing partially constructed analogies
involving complex geometric forms and functional
relationships. More recently, Oden, Thompson
& Premack (in preparation) further demonstrated
that this same chimpanzee could not only
complete, but also construct, analogies
spontaneously from a randomized grouping of
geometric elements.
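
The logic of the conceptual matching-to-sample task described above can be
sketched in a few lines. The sketch only formalizes the scoring rule; the
function names relation and conceptual_match are hypothetical, and nothing
here is a model of how a chimpanzee or child actually solves the task.

```python
from typing import List, Tuple

Pair = Tuple[str, str]


def relation(pair: Pair) -> str:
    """Abstract relation holding within a pair of objects."""
    return "same" if pair[0] == pair[1] else "different"


def conceptual_match(sample: Pair, alternatives: List[Pair]) -> Pair:
    """Pick the alternative whose internal relation equals the sample's.

    No object needs to be shared between the sample and the chosen
    alternative; only the relation between relations is compared.
    """
    for alternative in alternatives:
        if relation(alternative) == relation(sample):
            return alternative
    raise ValueError("no alternative shares the sample's relation")


# Identity sample: shoes go with apples, not with eraser and padlock.
choice_same = conceptual_match(
    ("shoe", "shoe"),
    [("apple", "apple"), ("eraser", "padlock")],
)

# Nonidentity sample: eraser and padlock go with cup and paperweight.
choice_different = conceptual_match(
    ("eraser", "padlock"),
    [("apple", "apple"), ("cup", "paperweight")],
)

print(choice_same)
print(choice_different)
```

Because the correct alternative shares no objects with the sample, a
matcher relying on physical identity alone has nothing to go on.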

These  findings  imply  that  language  or  sym¬ 
bol  training  does  not  instill  propositional  knowl¬ 
edge  about  abstract  relations  of  the  type  de¬ 
scribed  above,  but  it  does  appear  necessary  for 
the  explicit  expression  of  such  knowledge  in 
equivalence  judgment  tasks.  The  implication 
then  is  that  experience  with  external  symbol 
structures  and  experience  using  them  trans¬ 
forms  the  shape  of  the  computational  spaces 
that  must  be  negotiated  in  order  to  solve  cer¬ 
tain  kinds  of  abstract  problems.  This  finding 
dovetails  with  the  independent  demonstration 
by  Clark  and  Thornton  (1997)  that  standard 
connectionist  learning  by  artificial  intelligent 
systems  runs  aground  in  exactly  the  same  class 
of  tasks  used  with  the  child  and  chimpanzee, 
unless  the  net  is  provided  with  some  external 
means  of  reducing  the  search  space. 


MONKEYS DEMONSTRATE NEITHER
IMPLICIT NOR EXPLICIT KNOWLEDGE
ABOUT ANALOGICAL RELATIONS.

The provision of such ‘external means’ via
symbol training with tokens does not enable
macaque monkeys to judge the analogical
equivalence of stimulus pairs (Washburn,
Thompson & Oden, 1997; ms. in preparation).
“Symbol” sophisticated monkeys were trained
to choose “Circle” following an identity pair
(AA → ○) and to choose “Triangle” following
a nonidentity pair (CD → △). Then they
generalized this ability to novel identity (BB) and
nonidentity (EF) stimulus pairs. Nevertheless,
as shown in Figure 1, unlike chimpanzees with
the same experience (Thompson, Oden &
Boysen, 1997), the monkeys still failed to match
AA with BB and CD with EF above chance
levels despite their success on physical matching
problems. Why should this be? Thompson &
Oden (1996) demonstrated that, contrary to ape
and child, adult macaque monkeys are
perceptually insensitive to analogical equivalencies
of a propositional nature. Hence, the circle and
triangle tokens could not acquire symbolic
meaning as was the case for chimpanzees.
Instead, the circle and triangle tokens were
restricted to functioning simply as choice
alternatives signaled by the preceding physical
equivalence judgment that ‘A is A’ or ‘C is not D’.

Adult  rhesus  macaque  monkeys  do  not 
spontaneously  perceive  analogical  or  relation¬ 
al  identity  when  tested  using  the  same  prefer¬ 
ence  for  novelty  procedures  employed  with  the 
chimpanzees  and  human  infants  (Thompson, 
Oden,  &  Gunderson,  1997).  Thus  far,  this  dis¬ 
parity  holds  true  regardless  of  the  task  (paired- 
comparison  &  habituation/dishabituation)  and 
hence  time  available  for  information  process¬ 
ing,  or  whether  visual  gaze  or  object  handling 
is  the  dependent  measure  (Chaudhri,  Ghazi, 
Thompson  &  Oden,  1997;  Thompson,  1995; 
Thompson  &  Oden,  1996;  Thompson,  Oden, 
Boyer,  Coleman,  &  Hill,  1997).  Nevertheless, 
regardless  of  the  dependent  measure,  the  same 
animals  give  every  indication  that  they  perceive 
objects  to  be  the  same  or  different  based  on 
physical properties alone.

Figure 1. Percent correct performance on physical (i.e.,
object) and conceptual (i.e., analogical relations-
between-relations) matching-to-sample (MTS) tasks
by chimpanzees and macaque monkeys previously
trained with symbols for “same” and “different”.
[Groups shown: Chimpanzees, Monkeys.]
Data for chimpanzees derived from Thompson,
Oden & Boysen (1997).
Data for monkeys derived from Washburn,
Oden, & Thompson (1997).

Recent data collected from infant macaques
further indicate that these results are not
simply a function of age (Maninger, Gunderson, &
Thompson, 1997). As shown in Figure 2,
7-week-old pigtailed macaque infants, like the
adult macaques but in contrast to their human
counterparts, fail to recognize abstract relations
on a visual paired-comparison measure. This
is the first study using the familiarity-novelty
paradigm in Gunderson’s laboratory that has
shown a discontinuity in perceptual-cognitive
development between macaque and human
infants (Grant-Webster, Gunderson & Burbacher,
1990; Gunderson, Rose & Grant-Webster,
1990; Sackett, Gunderson & Baldwin, 1982).

Figure 2. Percent preferences for physical/object and
conceptual/relational novelty in visual paired-
comparison tasks. [Legend: Physical Condition,
Conceptual Condition. Groups shown: Human Infants,
Infant Macaques, Aged Macaques.] Data for human
infants derived from Tyrrell et al. (1991). Data for
macaque monkey infants and adults derived from,
respectively, Maninger, Gunderson, & Thompson (1997),
and Thompson, Oden, & Gunderson (1997).

CONCLUSIONS 

Taken together, all the above findings
imply that analogical reasoning in natural, and
possibly artificial, agents cannot emerge from
a tabula rasa. Rather, as suggested also by
Clark and Thornton’s work (1997), the
facilitative effects of language and symbol training
on analogical reasoning can only operate upon
pre-existing perceptual competencies. This
restructuring of input/output spaces permits the
establishment of new similarity or
neighborhood relations between stimuli.

INTRODUCTION

The ability to use relational similarity is
considered a hallmark of sophisticated
thinking; it plays a role in theories of
categorization, inference, transfer of learning and
generalization (Gentner & Markman, 1997; Halford,
1993; Holyoak & Thagard, 1995; Novick, 1988;
Ross, 1989). However, young children often
fail to notice or use relational similarity
(Gentner, 1988; Gentner & Rattermann, 1991;
Goswami, 1993; Halford, 1993). For example,
when given the metaphor “plant stems are like
drinking straws,” 5-year-old children focus on
the common object similarities, commenting
that “They are both long and thin,” whereas
9-year-olds focus on the relational commonality
that “They both carry water” (Gentner, 1988).

This relational shift in children’s use of
similarity (a shift from early attention to
common object properties to later attention to
common relational structure) has been noted
across many different tasks and domains
(Gentner & Rattermann, 1991; Halford, 1993). For
instance, Gentner and Toupin (1986) presented
children with a story mapping task in which
object similarity and relational similarity were
cross-mapped: that is, similar objects were
placed in different relational roles in the two
scenarios, so that the plot-preserving relational
correspondences were incompatible with
obvious object-based correspondences. Under
these conflict conditions, 6-year-old children
were unable to preserve the plot structure in
their mapping, although they could transfer the
story plot accurately when given similar
characters in similar roles. Older children
(9-year-olds) could maintain a focus on the relational
structure and transfer the plot accurately despite
competing object matches. There is evidence
that this shift from objects to relations is based
on gains in knowledge (Brown, 1989;
Goswami, 1993; Kotovsky & Gentner, 1996;
Rattermann & Gentner, in press), although
maturational changes may also play a role (Halford,
Wilson, Guo, Gayler, Wiles & Stewart, 1995).

Children’s  ability  to  carry  out  purely  rela¬ 
tional  comparisons  improves  markedly  across 
development.  Yet  even  very  young  children  can 
reason  analogically  under  some  circumstances 
(Crisafi & Brown, 1986; Kotovsky and
Gentner, 1996). For example, Gentner (1977)
demonstrated that preschool children can perform
a  spatial  analogy  between  the  familiar  base 
domain  of  the  human  body  and  simple  pictured 
objects,  such  as  trees  and  mountains.  When 
asked,  “If  the  tree  had  a  knee,  where  would  it 
be?,”  even  4-year-olds  (as  well  as  6-  and  8-year- 
olds)  were  as  accurate  as  adults  in  performing 
the  mapping  of  the  human  body  to  a  pictured 
object,  even  when  the  orientation  of  the  tree 
was  changed  or  when  confusing  surface  at¬ 
tributes  were  added  to  the  pictures. 

What factors impede or promote the
perception of common relational structure?
According to structure-mapping theory (Gentner,
1983, 1989; Gentner & Markman, 1997), an
analogy is the mapping of knowledge from one
domain (the base) to another domain (the
target) in which the system of relations that holds
among the base objects also holds among the
target objects. When adults interpret an
analogy, the correspondences between base and
target objects are based on common roles in
the matching relational structures; the
corresponding objects in the base and target do not
have to resemble each other. However,
although the final interpretation of an analogy
is determined by relational similarity rather
than by object similarity, we hypothesize that
in the actual process of computing an analogy
both object similarity and relational
similarity are at work (Falkenhainer, Forbus, &
Gentner, 1990; Halford, Wilson, Guo, Gayler,
Wiles & Stewart, 1995; Holyoak & Thagard,
1989; Hummel & Holyoak, 1997; Keane &
Brayshaw, 1988).

A natural consequence of the structure-mapping
view is that knowledge of relations plays a
crucial role in the mapping process; if the child
(or adult) has not represented the relations that
hold within the domain, then the matches formed
will be based upon common object similarity
rather than common relational similarity. Thus,
as domain knowledge increases, so does the
likelihood that the child’s comparisons will be based
on common relational structure.

In summary, the ability to use relational
similarity is sensitive to changes in the child’s
knowledge base. With increasing knowledge of
the  relationships  in  a  domain,  children  become 
more  able  to  understand  and  produce  purely 
relational  matches.  This  brings  us  to  the  issue 
of  interactions  between  language  and  thought. 

Language  and  Relational  Similarity 

We propose that language may interact
with the development of analogical ability
by serving as an invitation to seek likeness,
that is, to make comparisons. The word-learning
studies of Markman, Waxman, and others
have shown that when children are taught a
new object term they assume that the
word applies to things of like kind (Imai,
Gentner & Uchida, 1994; Markman, 1989;
Waxman & Gelman, 1986). However, this
work has focused on noun learning. We
propose that the acquisition of relational
language promotes the development of
analogy by inviting children to notice and
represent higher-order relational structure
(Gentner & Medina, 1997). So far, the evidence
on this issue is rather scant, although
Kotovsky and Gentner (1996) found that
4-year-olds were better able to perceive
cross-dimensional perceptual matches (e.g.,
symmetry of size compared to symmetry of
shading) when they had previously been
taught a relational label (“even”) to
identify the relation of symmetry.

The  Present  Studies 

In  these  experiments  we  tested  whether 
children’s  relational  performance  can  be  im¬ 
proved  by  the  introduction  of  relational  labels. 
The  basic  task  used  in  these  experiments  was  a 
cross-mapping  search  task  in  which  object  sim¬ 
ilarity  and  relational  similarity  were  in  conflict 
so  that  a  response  based  on  one  type  of  similar¬ 
ity  precluded  a  response  based  on  the  other.  We 
chose  the  higher-order  relation  of  monotonic 
change  in  size  across  position.  This  relation  has 
the advantage that it can be understood on the
basis of perceptual information available to the
child (in contrast to some causal or social
higher-order relations that may require abstract
knowledge). This cross-mapping task is used
in Experiment 1, whose results serve as a
baseline level of performance. In Experiment
2,  we  gave  children  the  relational  labels  “Dad¬ 
dy/Mommy/Baby”  and  found  a  predicted  gain 
in  performance.  In  Experiment  3  we  tested  other 
sets  of  relational  labels,  and  further,  tested  for 
long-term  effects  of  labeling. 

EXPERIMENT 1

Participants

The participants were 24 3-year-olds, 24
4-year-olds, and 16 5-year-olds.

Procedure

Children  were  asked  to  map  monotonic 
change  in  size  between  a  triad  of  objects  belong¬ 
ing  to  the  experimenter  and  a  triad  of  objects 
belonging  to  the  child.  A  cross-mapping  was  cre¬ 
ated  by  staggering  the  sizes  of  the  objects,  as  il¬ 
lustrated  in  the  following  diagram  in  which  the 
objects,  represented  by  numbers,  form  monoton¬ 
ic  change  in  size  from  left  to  right. 

E  3  2  1

C  4  3  2

The  experimenter  and  the  child  sat  across 
from  each  other  with  the  stimulus  sets  in  two 
arrays  separated  by  about  6  inches,  forming  an 
arc  in  front  of  the  child.  The  child  closed  his 
eyes  and  the  experimenter  hid  a  sticker  under¬ 
neath  one  of  his  toys,  as  she  explained,  “I’m 
going  to  hide  my  sticker  underneath  one  of  my 
toys  while  you  watch  me.  If  you  watch  me  care¬ 
fully,  and  think  about  where  I  hid  my  sticker, 
you’ll  be  able  to  find  your  sticker  underneath 
one  of  your  toys.”  She  then  placed  her  sticker 
under  a  toy  in  her  set  and  said  “If  I  put  my  sticker 
under  this  toy,  where  do  you  think  yours  is?”. 
The  child  was  then  allowed  to  guess,  but  kept 
the  sticker  only  if  he  found  it  on  his  first  guess. 

Using a relative size rule, if the experimenter
chose object 2 in her set, the correct choice is
the child’s object 3. (Notice that the child must
resist an object match between the experimenter’s
object 2 and his object 2.) Thus, object
similarity was put in conflict with relational
similarity (in the form of monotonic increase),
forming a task in which a response based on either
similarity type is possible, but only a response
based on relational similarity is correct. The
children performed 14 cross-mapped trials.
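
The conflict between the two kinds of response can be made concrete with a
small sketch. It assumes the child's triad is 4, 3, 2, as the relative-size
rule described above implies, and represents each object by its size
number; the function names are hypothetical.

```python
from typing import List, Optional

# Sizes listed from left to right, larger numbers meaning larger objects.
experimenter = [3, 2, 1]
child = [4, 3, 2]  # assumed staggered triad


def relational_choice(exp: List[int], chi: List[int], chosen: int) -> int:
    """Child's object holding the same relative position as the chosen one."""
    position = exp.index(chosen)
    return chi[position]


def object_choice(chi: List[int], chosen: int) -> Optional[int]:
    """Child's object identical in size to the chosen one, if any."""
    return chosen if chosen in chi else None


chosen = 2  # experimenter hides her sticker under her object 2
print("relational (correct) response:", relational_choice(experimenter, child, chosen))
print("object-match (incorrect) response:", object_choice(child, chosen))
```

For the experimenter's object 2 the two rules point to different toys,
which is what allows an object-identity error to be told apart from a
relational response.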

Materials 

We designed rich stimulus sets that contained
interesting, rich objects that varied along all
dimensions, including size, within the two sets (e.g.,
a red flower in a pot, a wooden house, a green
mug, and a race car) and sparse stimulus sets that
contained very simple, sparse objects that were
identical in all respects but size within the two
sets (e.g., clay flower pots). (See Figure 1.)


Based  on  structure-mapping  theory,  we 
predicted  that  the  rich  object  matches  would 
compete  strongly  with  the  relational  mapping 
rule.  In  contrast,  the  sparse  object  matches 
would  be  relatively  easy  to  overcome— children 
would  be  able  to  perform  the  relational  map¬ 
ping  despite  a  common  object  identity  choice. 
A  related  prediction  was  that  the  children  would 
make  significantly  more  object-identity  re¬ 
sponses  when  object  richness  was  high  than 
when  object  richness  was  low. 

Results  and  Discussion 

The  children’s  correct  relational  responses 
revealed  both  the  predicted  effect  of  object  rich¬ 
ness  and  the  relational  shift  in  analogical  per¬ 
formance.  The  richness  effect  led  the  children 
to  produce  significantly  more  relational  re¬ 
sponses  with  the  sparse  stimulus  objects  (54% 
for  the  3-year-olds,  62%  for  the  4-year-olds,  and 
95%  for  the  5-year-olds)  than  with  the  rich  stim¬ 
ulus  objects  (32%  for  the  3-year-olds,  38%  for 
the  4-year-olds,  and  68%  for  the  5-year-olds), 
suggesting  that  the  presence  of  rich,  distinctive 
object  matches  created  a  salient  alternative  to 
the  relational  response  (at  least  for  young  chil¬ 
dren).  In  contrast,  when  sparse  objects  were 
used,  the  object  similarity  matches  were  less 
compelling  and  therefore  less  likely  to  act  as  a 
competitive  alternative  to  the  relational  re¬ 
sponse.  As  further  evidence  for  the  effect  of 
object  richness,  the  number  of  object  identity 
errors  significantly  increased  with  the  use  of 
the  rich  stimuli  (33%  for  the  rich  versus  17% 
for  the  sparse,  collapsed  across  all  three  age 
groups).  The  relational  shift  was  found  in  the 
children’s overall performance, with the 5-year-olds
performing significantly better than the 3- and
4-year-olds, who achieved above-chance performance
only with the sparse stimuli.

In  Experiment  2  we  tested  whether  a  set  of 
relational  labels  that  provide  children  with  an 
explicit  relational  structure  could  help  them 
carry  out  a  relational  comparison  and  mapping. 
We  introduced  a  group  of  3-year-olds  to  the  use 
of  the  labels  “Daddy/Mommy/Baby”  to  de¬ 
scribe  the  relationship  of  monotonic  change, 
and  then  presented  them  with  the  cross-map- 
ping  task  of  Experiment  1.  We  hypothesized 
that  these  labels  would  provide  the  children  with 
an  explicit  framework  for  the  relational  system 
of  monotonic  change  in  size.  If  so,  then  the  chil¬ 
dren’s  ability  to  perform  a  relational  mapping 
with  both  the  rich  and  the  sparse  stimuli  should 
improve  with  the  use  of  these  labels.  To  obtain 
the  maximal  effect  of  labeling,  we  went  to  great 
lengths  to  ensure  that  the  children  were  famil¬ 
iar  with  the  relational  use  of  the  family  labels, 
training  participants  with  the  “Daddy/Mommy/ 
Baby”  labels  prior  to  presenting  them  with  the 
cross-mapping  task.  We  also  reminded  the  chil¬ 
dren  of  these  labels  on  each  trial  during  the 
course  of  the  experiment. 


EXPERIMENT  2 
Participants 

The  participants  were  24  3-year-olds. 

Procedure 

Label-training.  The  label  training  stimuli 
were  a  set  of  toy  penguins  and  a  set  of  teddy 
bears,  each  of  four  different  sizes  and  with  very 
different  markings.  Training  consisted  of  two 
sets  of  four  trials  in  which  the  cross-mapping 
between  objects  and  relations  did  not  hold 
(bears  were  mapped  to  penguins)  and  two  sets 
of  four  cross-mapped  trials  (penguins  were 
mapped  to  penguins).  The  following  protocol 
was  used  for  the  first  eight  trials: 

“These  bears  and  these  penguins  are  each 
a  family.  In  your  bear  family,  this  is  the  Daddy 
(pointing  to  the  larger  bear)  and  this  is  the  Mom¬ 
my  (pointing  to  the  smaller  bear).  In  my  pen¬ 
guin  family  this  is  the  Daddy  and  this  is  the 
Mommy  (again  pointing  appropriately).”  When 
the child successfully labeled the animals in both sets,
the experimenter said “If I put my sticker under my Daddy
penguin, your sticker is under your Daddy bear. Look, my
sticker is under my Daddy. Where do you think your sticker
is?” and the child was allowed to search for the sticker,
again only keeping it if he found it on his first guess.
After four trials a third, smaller, stuffed animal was
added to each set and the labels “Daddy/Mommy/Baby” were
applied in the manner described above. The same protocol
was adapted for use in the cross-mapped penguin/penguin
trials.

Cross-mapping trials. The cross-mapping task from
Experiment 1 was used. The children were first asked to
label both sets of objects using the family labels (this
was repeated every second trial), and then the
full-labeling procedure (e.g., “If I put my sticker under
my daddy toy, your sticker is under your daddy toy. Look,
my sticker is under my daddy, where do you think your
sticker is?”) was used. The participants each performed 14
sparse trials and 14 rich trials, counterbalanced, although
only their performance on the first stimulus type presented
was analyzed.

Results  and  Discussion 

The use of the “Daddy/Mommy/Baby” labels did improve young
children’s ability to make relational comparisons, even in
the face of a tempting object choice. When trained to use
these labels, 3-year-old children’s ability to map
relational similarity increased dramatically with both rich
and sparse stimulus sets. As can be seen in Figure 2, when
“Daddy/Mommy/Baby” was applied to the relation of monotonic
increase, the number of relational responses produced by
3-year-olds increased from 54% with the sparse stimuli and
32% with the rich in Experiment 1 to 89% and 79%,
respectively, bringing the performance of these
participants to the level of performance found in the
5-year-olds. Note, however, that even when the relational
labels were used, the effect of object richness was
replicated; children produced significantly more relational
mappings with sparse objects than with rich objects. Along
with the increase in relational mappings, there was a
concomitant decrease in the number of object identity
errors between Experiments 1 and 2 (from 23% to 8% with
sparse and from 43% to 19% with rich).

We propose that “Daddy/Mommy/Baby”
helped  the  young  children  notice  the  presence 
of  a  familiar  higher-order  relationship,  namely 
monotonic  change,  that  they  may  have  already 
represented.  Alternatively,  the  use  of  the  rela¬ 
tional  labels  may  have  led  children  to  align  the 
two  relational  systems  (the  E  set  and  the  C  set) 
and  derive  the  common  monotonicity  structure. 

In Experiment 3, we addressed three further issues. First,
we asked whether relational adjectives such as
“big/little/tiny” would also enhance children’s ability to
perform a relational mapping. Second, we tested for
long-term representational change brought about by our use
of labels by retesting a sample of 3-year-olds 1-4 months
after initial testing. And third, we addressed the
possibility that our use of the “Daddy/Mommy/Baby” labels
on every trial in Experiment 2 led the children to use the
labels as an external crutch, perhaps following the rule
“look under the object to which the same label has been
applied” without grasping the relationship of monotonic
increase in size. To be able to dismiss this possibility we
presented children with a small number of full-label
trials, after which they were given new stimulus sets and
asked to perform the cross-mapping without labels being
overtly applied.

EXPERIMENT  3 
Participants 

The participants were 51 3-year-olds, 28 of whom returned
to the laboratory for Session 2. The time period between
Session 1 and Session 2 varied from 1 month to 4 months.

Materials  and  Procedure 

Session 1. The children were randomly assigned to a
labeling condition: no-labels, “Daddy/Mommy/Baby,” or
“big/little/tiny.” Children were given the label-training
task used in Experiment 2, and then eight trials using
either the rich or the sparse stimulus sets and the
full-label procedure of the previous experiment. After
completing the labeled trials the children were shown a new
set of stimuli of the same richness type and were given
eight trials without labels.

Session 2. To ensure that the testing situation in this
later session was as different as possible from the initial
session, several changes were made: (1) the children were
tested using the opposite type of stimuli (i.e., rich or
sparse) from that used in their initial testing session;
(2) the children were tested in a different testing room;
and (3) a different experimenter performed the experiment.
The instructions given to the children were minimal; they
were reminded that they had played this game before:
“Remember, you played a Daddy/Mommy/Baby game last time.
Let’s see if you can still play the game.” The children
were given four practice trials, without labels, using the
stuffed penguins and bears. Each child was then presented
with eight unlabeled cross-mapping trials, followed by four
“reminder” trials in which the full-label procedure was
used, and then finally with eight more unlabeled trials.

Results  and  Discussion 

Session  1.  Children  trained  with  the  rela¬ 
tional  labels  (“Daddy/Mommy/Baby”  and  “big/ 
little/tiny”)  produced  significantly  more  relation¬ 
al  responses  than  children  in  the  no-label  condi¬ 
tion  (58%  for  relational  labels  and  41%  for  no¬ 
label,  collapsed  across  stimulus  complexity  and 
trial  type).  The  effect  of  richness  was  replicat¬ 
ed;  children  produced  significantly  more  rela¬ 
tional  mappings  with  the  sparse  stimuli  than  with 
the  rich  stimuli  (67%  versus  39%,  collapsed 
across  label  types  and  trial  type).  And  as  in  the 
previous  experiments  the  children  produced  sig¬ 
nificantly  more  object  identity  errors  with  the 
rich  than  the  sparse  (37%  versus  20%). 

Session 2. We first examined children’s ability to map
monotonic change in the first eight, non-labeled,
cross-mapping trials. These data reflect children’s ability
to apply previously learned relational structures with
minimal prior reminding. The “Daddy/Mommy/Baby” and
“big/little/tiny” labels led to more relational responses
on these trials than did no labels (62% with the family
labels and 54% with the relational adjectives versus 28%
with no-labels), suggesting that the children’s previous
exposure to relational labels had indeed changed their
representation of monotonic change. The second aspect of
children’s relational performance is their performance on
the second set of non-labeled trials, after the four-trial
“reminder” of the relational labels. Overall, children’s
relational responding increased after being reminded of the
relational labels (67% correct with relational labels,
versus 45% correct with no labels).

We did not find a significant effect of object richness in
these data, probably because these children were exposed to
both types of stimuli across the two experimental sessions.
It seems likely that this experience diluted the effect of
object richness found in the previous experiments.

GENERAL  DISCUSSION 

A robust finding in the study of children’s analogical
abilities is the relational shift (Gentner, 1988; Gentner &
Rattermann, 1991; Gentner and Toupin, 1986; Halford, 1993).
In Experiment
1  we  explicitly  tested  for  the  relational  shift  and 
found  that  the  presence  of  a  salient  object  simi¬ 
larity  choice  disrupted  relational  mapping  in  3- 
and  4-year-old  children,  but  that  5-year-olds  could 
map  relationally  despite  this  conflict,  supporting 
the  hypothesized  shift  from  objects  to  relations  in 
children’s  analogical  reasoning. 

In  addition  to  testing  for  the  relational  shift, 
we  also  made  predictions  specific  to  the  struc¬ 
ture-mapping  view  of  analogy.  The  predicted 
effect  of  object  richness,  one  of  the  most  robust 
findings  in  this  series  of  experiments,  derives 
directly  from  this  view.  We  propose  that  when 
performing  an  analogical  mapping,  children  (and 
adults)  will  begin  by  aligning  objects  based  on 
common  features,  and  further,  that  the  more  sa¬ 
lient  and  numerous  the  features,  the  more  likely 
that  object  matches  will  win  out  over  relational 
similarity in the final interpretations (Markman &
Gentner, 1993). In each of our experiments, the presence of
a rich object conflict was more detrimental to the ability
to perform a relational mapping than the presence of a
sparse object conflict. It is worth noting that a similar
effect has been found in the performance of adults
presented with a cross-mapping task. Markman and Gentner
(1993) found that adults will also respond based on object
similarity when the number of matching object attributes of
the cross-mapped objects is increased.

In the present work young children’s susceptibility to rich
object matches was due to their incomplete knowledge of
monotonic change. We propose that simply using the labels
“Daddy/Mommy/Baby” invited children to represent the
higher-order relation of monotonic change. We further claim
that the addition of this relational knowledge led to a
striking improvement in the children’s ability to perform
relational mappings, equivalent to that of a 2-year age
gain.

Finally,  these  experiments  show  quite  force¬ 
fully  that  language,  and  in  particular  relational 
language,  can  facilitate  relational  representation. 
We found that both “Daddy/Mommy/Baby” and
“big/little/tiny” led to increased relational
responding in our three-year-olds, and that this
ability  remained  several  weeks  after  the  initial 
exposure  to  these  relational  labels.  The  role  of 
language,  we  suggest,  is  to  provide  an  invitation 
to  form  comparisons  and  further,  to  provide  an 
index  for  stable  memory  encoding  of  the  newly 
represented  relational  structure. 

IMPLICATIONS  FOR  THEORIES  OF 
ANALOGY 

The  research  of  Halford  and  his  colleagues 
(Halford,  1993;  Halford,  Smith,  Dickson,  May- 
bery,  Kelly,  Bain,  &  Stewart,  1995)  has  also 
found  the  shift  from  objects  to  relations.  They 
propose  that  an  important  driver  of  this  shift  is 
changes  in  cognitive  capacity.  That  is,  children 
show  a  developmental  increase  in  cognitive 
capacity  that  allows  them  to  represent  and  map 
increasingly  more  complex  matches.  Thus,  for 
example,  not  until  three  years  should  children 


be  able  to  carry  out  complex  system  matches. 
In  contrast,  in  our  account  it  is  domain  knowl¬ 
edge  that  leads  to  increases  in  children’s  ana¬ 
logical  abilities. 

Neither view of analogy is meant to be exclusive;
we acknowledge the role of maturational
change in children’s cognitive abilities
and  Halford  has  consistently  noted  the  role  of 
knowledge.  However,  our  results  demonstrate 
that  striking  changes  in  ability  can  occur  over 
the  course  of  one  experimental  session,  and 
further  that  these  gains  persist  after  the  experi¬ 
mental  session  is  over.  It  appears  that  the  lim¬ 
its  on  performance  are  not  in  children’s  capac¬ 
ity  to  represent  and  use  complex  relations,  but 
rather  in  whether  they  have  as  yet  represented 
a  given  complex  relation.  These  results  under¬ 
score  the  point  that  an  increase  in  relational 
responding  is  not  evidence,  in  itself,  of  matu- 
rational  gain. 

Another  prominent  theory  of  analogical 
reasoning  is  Goswami’s  (1993)  relational  pri¬ 
macy  view.  Goswami  proposes  that  very  young 
children (3 years old) can perform an analogy
when  they  have  represented  the  requisite  rela¬ 
tional  structure.  While  we  agree  with  Goswa¬ 
mi  that  domain  knowledge  plays  a  crucial  role, 
we  differ  in  the  hypothesized  role  of  object  sim¬ 
ilarity.  Goswami  has  stated  that  “As  long  as  the 
relations  that  the  child  must  map  can  be  repre¬ 
sented....  then  performing  the  mapping  should 
present  little  difficulty,  and  this  should  hold  true 
whether  the  objects  to  be  mapped  are  similar 
or  different  in  appearance.”  (Goswami,  1995, 
p.  891).  However,  in  our  studies  there  is  still  a 
robust  effect  of  object  similarity,  even  when 
labels  have  been  applied  and  the  children’s  re¬ 
lational  performance  is  overall  very  good. 

Language  and  Relations 

We  have  presented  the  view  that  labels,  and 
in  particular  relational  labels,  invite  children  to 
notice and retain patterns of elements; language
encourages them to modify thought. When applied across a
set of cases (or a pair of cases, as here) labels provide
children with an invitation to make comparisons, and then
provide a system of meanings upon which to base these
comparisons (Gentner & Medina, 1997). The results of these
labeling studies support this view; 3-year-olds trained
with the labels “Daddy/Mommy/Baby” and “big/little/tiny”
showed a significant increase in relational responding with
a relatively simple linguistic intervention.

The  results  of  these  experiments  suggest 
that  young  children  can  perform  a  relational 
mapping,  even  in  the  presence  of  conflicting 
object  similarity,  when  familiar  labels  are  used 
to highlight the appropriate relational structure. The
impressive gains in ability after the use of relational
labels support our claim that language provides an
invitation for children to modify their thought. Language
is not, however, the only path to relational competence.
Other manipulations, such as progressive alignment, in
which children are presented with easy literal similarity
matches prior to difficult analogical matches, will also
lead to improvement (Kotovsky & Gentner, 1997). Work with
primates has also shown that relational labels need not be
embedded in a full linguistic system to improve relational
responding (Thompson, Oden & Boysen, 1997).

Thus  we  conclude  that  one  factor  in  the 
development  of  the  ability  to  use  relational  sim¬ 
ilarity  is  the  acquisition  and  use  of  relational 
language.  Relational  language  can  serve  as  a 
catalyst for comparison and alignment of objects and
relations, which can, in turn, provide a mechanism for the
progression from children’s naive thought to the
sophisticated, abstract thought of adults.

INTRODUCTION

Analogical reasoning is one of the main themes in the
psychology of cognitive development, probably because it is
a constitutive developmental mechanism: it allows the
subject to construct and modify his knowledge in a flexible
and adaptive way (Halford, 1993). Our goal is to analyse
the conditions favouring analogical problem-solving in 5-
to 6-year-old children.

Historically, studies on analogical reasoning were first
devoted to proportional analogies (a:b::c:d), then to the
solving of new problems (target problems) that refer to
known problems (source problems). In the Piagetian view,
the ability to reason by analogy emerges when the child
reaches the formal stage. More recently, researchers have
shown that younger children fail not because they are
unable to reason by analogy but because of their lack of
knowledge about objects or causal relations (Goswami &
Brown, 1989; 1990; Gentner & Rattermann, 1991). However,
most of the situations proposed to the children are
four-term analogies, which only highlight the solution
phase of the process. Other situations, such as problem
analogies, require the source to be retrieved before the
target problem can be solved.

In problem-solving situations, different factors can
improve the transfer of the solution. Brown, Kane & Long
(1989) prompted their subjects to extract the relational
structure common to the two situations. Brown, Kane &
Echols (1986) obtained even better results by helping
children to bring out the goal structure of the source
story. Holyoak, Junn & Billman (1984) showed, in their
experiment, that when 5- to 6-year-old children are
prompted to map the target onto the goal structure, they
find the analogical solution.

Across these studies, analogical reasoning by young
children has been demonstrated only under very constraining
conditions. In particular, the target problem immediately
follows the source problem, and transfer requires an
explicit intervention by the adult.

The  present  research  contends  that  the  clas¬ 
sical  design  used  in  the  investigation  of  analog¬ 
ical  problem-solving  undermines  young  chil¬ 
dren’s  abilities.  The  base  analog  is  usually  in¬ 
troduced  as  a  story  and  the  probability  of  extrac¬ 
tion  of  the  relational  structure  is  then  very  low. 

In our two experiments, the goal is to show that when
children are able to reach an optimal level of
representation of the source without explicit explanation
by the adult, they can transfer the solution even after a
very long delay (one week). The situations proposed to the
children are inspired by Holyoak et al. (1984). However,
some characteristics have been modified (see below).

EXPERIMENT 1

Fifty-six kindergarten children from a preschool in an
underprivileged neighbourhood performed the source problem.
One of them was absent for the target problem, which
reduced the sample to 55 subjects, aged 5;1 to 5;10 years
(mean age 5;5 years).

TASK  AND  MATERIAL 
The task consists in depositing objects in a container
placed inside a bigger container pierced with an orifice.
The objects are too large to pass through the orifice,
which is situated diagonally opposite the inner container.
It is also necessary to avoid another container placed
directly below the orifice. This task is presented in two
analogous problems (“mouse problem”, “boys problem”) that
differ in their surface features but share the same
solution schema. The solution schema consists in
coordinating four schemes: to crumble (E), to roll (R), to
join (J) and to make pass (P). Each of these schemes is
known to children of this age. Note that the solution plan
is more complex here than in Holyoak et al. (1984), since
it requires one more move (to crumble).

“MOUSE PROBLEM”

On a table is placed a hemispherical canister made of
thick, transparent plastic. The top of the canister is
pierced with a narrow orifice. Inside, under the orifice,
is a small plastic pot containing water. A small plate is
diametrically opposite the pot (for an illustration of the
equipment, see annex 2). The mouse's house is placed in
front of the child in such a way that the hole is on the
left side. Scattered on the table are a slice of soft
sandwich bread and various objects (a sheet of flexible
transparent plastic, a wooden cube with 5 cm sides, a
pencil, a 10 cm wooden ruler).

The child is told that this is the house of a small mouse
and that, while the mouse is away, we want to put the bread
on its plate; care must be taken that the bread does not
fall into the glass of water.


“BOYS PROBLEM”

On a low table is placed a parallelepiped-shaped canister
made of black cardboard whose front face is replaced by a
sheet of transparent plastic. In the top face of the
canister there is a small orifice. Inside, under the
orifice, there is a small red cardboard canister, while on
the other side is a small white cardboard canister (see
annex 2). The canister is placed in front of the child in
such a way that the hole is on the right side. Scattered on
the table are a block of small interlocked cubes (Lego,
0.5 cm) and various objects (a sheet of flexible
transparent plastic, a wooden cube with 5 cm sides, a
pencil, a 10 cm wooden ruler).

The child is told that this is a bedroom containing the toy
chest of a naughty boy (red canister) and that of a nice
boy (white canister). We want to put the Lego in the nice
boy's canister and not in the naughty boy's.

Procedure. 

In the source situation, children are confronted with the
resolution of a complex problem. Complex means here that
the solution schema is not spontaneously known by children.
However, each step of the solution is familiar.

So as to neutralize possible effects linked to the content
of the problems, we alternated the status (source/target)
of the two problems: for half of the subjects the source
problem is the mouse problem and the target problem is the
boys problem, while for the other half the source problem
is the boys problem and the target problem is the mouse
problem.

First, the children, divided into two groups, are seen
individually in an isolated room and are invited to solve
one problem. Children in the control group (n = 27) try to
solve the task and no assistance is provided. The task is
considered finished when the child says so. Children in the
experimental group (n = 28) are given guidance at each
failed step of the solution (cf. annex 3). A week later,
the second problem is proposed to all subjects. For the
target problem, a sheet of aluminium replaces the
transparent plastic sheet. In case of an impasse, only one
hint is proposed: “Have you already encountered something a
bit similar that could help you?” The hint therefore only
assists the evocation of the source; in no case does it
stress the extraction of the structure common to the two
problems.

Group                    Success        Success      Failure
                         without hint   after hint
Experimental (n = 28)    10 (36%)       11 (39%)     7 (25%)
Control (n = 27)         0              0            27 (100%)

Table 1. Distribution of performances on the target problem
as a function of experimental conditions.

All actions and verbalizations of the subjects are noted by
the experimenter.

Results 

Since the order of presentation of the two problems
(mouse/boys and boys/mouse) had no significant effect, we
do not distinguish the data from these two orders.

Can one speak of analogical transfer between source and
target?

A first interesting result is the total absence of subjects
capable of producing the expected solution without
assistance during the source problem. Thus, only subjects
in the experimental group were exposed, through the
guidance, to the sequence of actions necessary for solving
the task.

We now consider the distribution of performances on the
target problem (cf. Table 1).

The absence of success in the control group, compared with
75% success in the experimental group, suggests that the
latter made a real analogy with the solution of the source
problem. Ten of the 21 subjects (48%) who produced the
expected solution succeeded without a hint. If, as these
data suggest, success on the target problem requires the
evocation of the source problem, one can look at the verbal
expressions of this evocation.
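
The percentages quoted from Table 1 can be rechecked with a few lines of Python. The counts (10 successes without hint, 11 after hint and 7 failures in the experimental group) come from the table; the variable and function names are ours and purely illustrative.

```python
# Recompute the Table 1 percentages for the experimental group.
# Counts are taken from Table 1; names are illustrative.

def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

experimental = {
    "success without hint": 10,
    "success after hint": 11,
    "failure": 7,
}
n = sum(experimental.values())  # 28 children in the group

# Share of each outcome within the experimental group.
shares = {outcome: percent(count, n) for outcome, count in experimental.items()}

# All successes, with or without the hint.
successes = experimental["success without hint"] + experimental["success after hint"]

print(shares)                 # 36, 39 and 25 percent
print(percent(successes, n))  # 75: overall success rate in the group
print(percent(experimental["success without hint"], successes))  # 48: unhinted share of successes
```

The same counts thus give both the per-outcome shares reported in the table and the proportion of successful children who needed no hint.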

Verbal  manifestations  of  the  evocation 
of  the  source. 


The existence of a gap between what subjects are capable of
accomplishing and the reality of their mental functioning
is now widely accepted. If one considers spontaneous verbal
evocation, subjects in the experimental group make this
evocation more often (61% against 41%). The difference is
not, however, significant. Table 3 clearly shows the
interpretive problem posed by the verbalization of the
evocation: children can evoke verbally without succeeding
(cf. the control group) or succeed without evoking verbally
(cf. the experimental group).

Discussion 

The performances of the control group attest that the type
of problem proposed cannot be solved by retrieving a
solution strategy from memory. We therefore do have here a
situation demanding analogical reasoning.

The resolution of the target problem by the majority of
subjects in the experimental group thus clearly
demonstrates the capacity of 5- to 6-year-old children to
solve a problem by analogy with a source situation, when
the latter is itself presented in the form of a problem to
solve. Such a result suggests that we have thus created,
without directive induction on the part of the adult,
conditions allowing subjects to represent the relational
structure of the source problem. Note the favourable
effects on transfer despite the long period, a week,
between the resolution of the source and that of the
target.

If one compares these results with those of Holyoak et al.
(1984) in the magic carpet condition, one notes a roughly
equivalent proportion of subjects who spontaneously find
the solution (36% here and 30% in Holyoak et al., 1984).
The greater efficiency of the situation proposed here shows
in the success after hinting. In Holyoak et al., no child
solved the problem despite a hint that focused him
explicitly on the structure, and although the target
problem immediately followed the source. In our case, the
hint, a simple incentive to evoke the source, allows a
further 39% of subjects to transfer the solution although a
week separates the source and target situations.

Nevertheless, although the two situations, source and target, differ here
both in the presentation of the problem and in the resources available to
solve it (the type of sheet to roll, the critical point for the solution,
differs from one problem to the other), the functional and perceptual
similarity between the two problems appears greater than in the “magic
carpet/sheet” condition of Holyoak et al. (ibid.).

One might be tempted to liken our situations to those used by Holyoak et al.
in the “magic stick/stick” condition, in which the hint seems very effective
since it led 100% of subjects to success. However, in that condition the
perceptual characteristics of the stick (close to those of the magic stick)
are precisely those that suggest its possible function, namely bringing
closer a container that is too far away. In our study, on the other hand, the
undeniable perceptual similarity between a sheet of aluminium and a sheet of
plastic does not point to the function of a tube. Now, as Holyoak and Thagard
(1995) have emphasized, the uptake of perceptual cues is guided by function.

One of the objectives of Experiment 2 is precisely to assess the weight of
perceptual similarity in the high rate of transfer obtained.

EXPERIMENT  2 

Besides Holyoak et al. (1984), many authors (Gentner & Toupin, 1986; Goswami,
1992, for a review) have established that increasing the perceptual
similarity between source and target favors analogical transfer in children;
the same is observed in adults.


One can therefore offer a second interpretive hypothesis for the high rate of
transfer obtained in Experiment 1. It would not essentially result from
conditions favorable to representing the relational structure of the source,
but rather from a similarity such that it would make very probable the
evocation of the source and then the mapping between the two situations.

The results of Brown et al. (1986), on analogical situations in which the
resources available in source and target were strictly identical, suggest
that similarity is not a sufficient condition for transfer. It nevertheless
appears important to test experimentally this alternative to the hypothesis
that we favor.

We therefore compared the condition studied in Experiment 1, “source
problem/target problem” (P-P condition), on the one hand with the “classic”
mode of presentation (Holyoak et al., ibid.), “source story/target problem”
(H-P condition), and on the other hand with a condition “source story acted
out by the experimenter/target problem” (HM-P condition). Comparing the P-P
condition with the “classic” H-P condition alone would remain ambiguous in
terms of interpretation. Indeed, besides the fact that in the former subjects
have more opportunity to assimilate the structure, all the objects at their
disposal for solving the source problem are perceptually identical (except
for the sheet to roll) to the objects available when solving the target
problem. This is not the case in the H-P condition, in which an illustrated
book is simply read to the children. There may therefore be a strictly
perceptual cue capable of favoring the analogy. The HM-P condition should
make it possible to decide between these two hypotheses. Only if the P-P
condition proved significantly superior to both the H-P and HM-P conditions
could we reject an interpretation resting exclusively on a perceptual
facilitation linked to the objects.

Another way of taking further the study of the role of the degree of
perceptual proximity between source and target in producing transfer is to
vary it experimentally, not in terms of the resources available for the
solution, but in terms of the “environment” (setting) in which the target
problem is posed.

Finally, although we noted in the first experiment a link between verbal
production (in this case, evocation of the source) and analogical transfer,
we wished to explore the role of a systematic verbal production requested
from subjects at the end of the source situation. This consists in asking
them to “retell” the story they have just been told (H-P and HM-P conditions)
or what they have just done (P-P condition). Such a verbal retelling
(restitution) task was used by Brown et al. (1986). These authors observed
that the retelling in itself had no effect on transfer, even though it took
place immediately before the target problem was solved (since the target
immediately followed the source). We will attempt an analysis along these
lines, noting however that the length of the delay (a week) that we impose on
the children would tend to reduce further the effect of the retelling.

Three experimental factors are thus manipulated in Experiment 2, yielding
3 x 2 x 2 = 12 experimental groups: the mode of presentation of the source
(P-P / H-P / HM-P); the degree of perceptual similarity between source and
target (close / far); and whether or not a retelling of the source phase is
requested.
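
The crossing of the three factors can be sketched in a few lines of Python.
This is only an illustration of the 3 x 2 x 2 design described above: the
variable and function names are ours, and the labels for the third factor
(a retelling, or restitution, requested or not) are shorthand rather than
terms taken from the study.

```python
from itertools import product

# Factor levels as named in the text; identifiers are illustrative only.
PRESENTATION = ("P-P", "H-P", "HM-P")       # mode of presentation of the source
SIMILARITY = ("close", "far")               # perceptual similarity source/target
RETELLING = ("requested", "not requested")  # retelling of the source phase


def experimental_groups():
    """Return every combination of the three factor levels."""
    return list(product(PRESENTATION, SIMILARITY, RETELLING))


groups = experimental_groups()
print(len(groups))  # 12
for group in groups:
    print(group)
```

Each tuple corresponds to one experimental group, for example
("P-P", "close", "requested").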

Subjects 

183 children (mean age: 5;6 years) participated in this second experiment.
They came from 10 different schools located in an average socio-economic
environment. The children of each classroom were distributed evenly across
the 12 experimental groups. However, because some subjects were absent at the
time of the target problem, the number of subjects is not strictly identical
in each group.


Material 

The material of Experiment 1 was supplemented by the material used for the
“far” level of the source/target similarity factor (cf. Table 1). An
elephant, carrying on its back a basket pierced with a small hole, is placed
at the foot of a mountain, on the far bank of a river laid out between the
foot of the mountain and the elephant. On the mountain stands a small doll
holding a slice of soft bread. The problem consists in helping the doll to
put the bread into the elephant’s basket without the bread falling on the
ground or into the river. From a perceptual point of view, the “elephant”
problem differs from the “boys” problem: on the one hand, the objects of the
situation are not in a closed environment; on the other hand, each object is
appreciably larger.

Procedure 

Since the order of presentation of the problems had no effect in Experiment
1, we used the same “boys” problem as the source for all subjects. The
guidance towards the solution in the source problem is identical to that of
Experiment 1. In the problem condition and in the condition where the story
is acted out by the experimenter, various objects are laid out on the table
(yoghurt pot, paper clip, pencil, 10 cm length of string, eraser, transparent
plastic sheet). The solution is shown by moving the plastic sheet. In the
read-story condition, an illustrated book serves as a support for the
narration (see Annex 5 for the text of the story).

For the target problem, the transparent plastic sheet of the source is
replaced by a sheet of aluminium. Two levels of hint are planned if the child
does not find the solution spontaneously. Hint 1 suggests the need for
evocation: “Have you already been told a story that could help you?” Hint 2
prompts evocation of the source story: “Last week you were told the story of
a nice boy and a wicked boy.” Note that these two prompts remain less
directive than those of Holyoak et al. (ibid.), where the authors first
evoked the story that had just been told and then focused attention on what
the genie had done.

Results

The results of Experiment 2 are presented in Table 2. Only subjects who
strictly succeeded in solving the target problem by chaining the 4 schemes
Crumble, Roll, Join and Pass are counted (some children rolled the sheet of
aluminium and then failed).

Rows give the conditions of presentation of the source problem; columns give
the performances on the target problem.

Examination of the number of successes on the target problem in the three
conditions reveals, in accordance with the hypothesis we favor, a significant
effect of the source presentation condition (chi-squared = 29.27; p < .0001).
In particular, the P-P condition proves more effective than each of the two
other conditions (P-P vs HM-P: chi-squared = 4.661; p = .0309; P-P vs H-P:
chi-squared = 10.561; p = .0012). One also observes a greater number of
successes in the condition where the source story is acted out than in the
one where it is read (HM-P vs H-P: chi-squared = 9.735; p = .0018). The
ordering of the three modes of presentation of the source remains unchanged
when one considers the proportions of success without a hint.
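
For readers unfamiliar with the statistic, the sketch below shows how a
Pearson chi-squared value is computed from a table of counts, assuming that
the comparisons above are standard tests of independence on success/failure
counts per condition (the text does not specify this). The counts in the
example are HYPOTHETICAL and chosen only to make the arithmetic easy to
follow; they are not the data of this study. The function names are ours.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Assumes every expected count is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# HYPOTHETICAL counts, not the study's data:
# rows = two conditions, columns = (successes, failures).
hypothetical = [[10, 20], [20, 10]]
print(round(chi_squared(hypothetical), 3))  # 6.667
print(degrees_of_freedom(hypothetical))     # 1
```

In this toy table every expected count is 15, so each cell contributes
25/15 and the statistic is 4 x 25/15, about 6.667.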


Overall, the proportion of successes on the target problem reveals no
difference between the two degrees of perceptual proximity between source and
target (mouse vs elephant): this proportion is 59% in the close proximity
condition and 53% in the weaker proximity condition. Taking into account the
moment of success, one nevertheless notes a significantly higher number of
successes without a hint when the proximity between source and target is
close (chi-squared = 5.118; p = .02). This superiority in fact results from
the P-P condition alone.

The retelling requested at the end of the source phase produces no
significant overall effect on performance. One does observe, however, in the
H-P condition a tendency for subjects who had to retell the story to be more
numerous in solving the target problem (corrected chi-squared = 3.254;
p = .0712). This result may need to be qualified according to the quality of
the retellings. We distinguished three types of retelling: those that make
the whole solution explicit; those that mention only some schemes, but
including the critical one, “to roll”; and finally those that mention only
some elements of the material.

These results do not reveal any predictive value of the quality of the
retelling. One observes only a tendency for success without a hint to be less
probable among subjects who retold only elements of the material. In the H-P
condition, where the retelling seemed to favor performance, one observes
that, with a single exception, all subjects who solved the target problem are
those who verbally extracted the structure of the source problem during the
retelling.

Table 2. Distribution of performances on the target problem as a function of
the conditions of presentation of the source problem and the degree of
perceptual similarity between base and target.

Furthermore, the quality of the retellings does not distinguish subjects
according to the condition in which the source was processed.

DISCUSSION  OF  EXPERIMENT  2 

This second experiment reinforces the hypothesis that it is indeed the
quality of processing afforded by the problem-solving activity on the source
that is responsible for the good performance observed on the target in
Experiment 1. In accordance with a classic finding in the literature, one
nevertheless observes some facilitation linked to the degree of perceptual
proximity. More precisely, it appears here that greater proximity favors the
evocation of the source, since fewer subjects then need hints prompting them
to evoke the source.

The absence of an overall effect of the retelling is consistent with the
results of Brown et al. (1986) recalled above. It is interesting to note that
an effect is observed in the condition that is, by hypothesis, the least
favorable to focusing subjects on the structure of the source problem (H-P
condition). It would therefore seem that the retelling task could contribute
to this focusing, whereas the story read aloud did not prompt it. This
interpretation is reinforced by the observation that almost all the children
making the analogy in this condition are those who made the structure of the
problem fully explicit.

In the two other conditions of presentation of the source, one finds no
differences linked to the quality of the retelling. This observation could
partly result from the long delay between the retelling and the solving of
the target (a week). But in our opinion, more fundamentally, it raises once
again the question of the reliability of verbal data as a reflection of the
processing carried out by subjects.

GENERAL  DISCUSSION 

Since the mid-1980s, the date of the pilot study of Holyoak et al. (ibid.),
psychologists’ view of the capacity of kindergartners to solve problems by
analogy has become appreciably more optimistic (cf. Holyoak & Thagard, 1995).

The data presented here as a whole demonstrate the capacity of 5- to
6-year-old children to solve problems by analogy. This demonstration is
particularly convincing, and for more than one reason.

On the one hand, the high rate of transfer in the problem-problem condition
was obtained despite an extraordinarily long delay between the source
situation and the target problem. To our knowledge, no study had attempted to
impose such a delay on children, favoring instead situations where the target
immediately follows the source.

On the other hand, as we recalled in the introduction, the factors studied so
far for their effectiveness in young children’s analogical transfer almost
systematically involve a very explicit intervention by the adult, leading
children to extract the relational structure of the source situation. Here,
by contrast, the hints did not go beyond assistance with the evocation of the
source. This difference is obviously fundamental insofar as, as Holyoak &
Thagard (1995) emphasize, the period from 4 to 6 years constitutes precisely
a transitional phase during which the capacity to map systems of relations,
and not only first-order relations, develops. In the studies reported here,
the mapping was never explicitly suggested. Moreover, the superiority of
processing the source through problem solving, as compared with a source
story that is acted out or told, makes it possible to reject the hypothesis
that help with the mapping is provided solely by the perceptual similarity
between the resources offered in source and in target.




M. Bastien-Toniazzo, A. Blaye, D. Cayol


Thus, appropriating the source in the form of problem solving proves to be a
sufficient condition for analogical transfer in a majority of 5- to
6-year-old children, even in the situation of lesser perceptual proximity
between the two problems. The absence of any difference in the quality of
verbal retelling between the different conditions of processing the source
does not make it possible to show that the goal structure of the source is
made more explicit by subjects placed in the Problem-Problem condition. From
this point of view, we have not obtained a direct validation of the
hypothesis that solving the source is a favorable condition for transfer
because it allows a focus on the relational structure. Nevertheless, verbal
retelling demands a level of explicitation (in the sense of Karmiloff-Smith,
1992) of the representation built by the subject that does not seem to be a
necessary condition for a mapping with the target problem.

As Karmiloff-Smith has proposed, solving a problem by analogy does indeed
require a “representational redescription” of the source so as to adapt it to
the solving of the target. We know, according to this author, that a
precondition for redescription is “behavioral mastery” of the initial
situation. The studies presented in this article reinforce this thesis and
suggest that one of the best ways to acquire this mastery consists precisely
in placing subjects in a situation of guided problem solving on the source.

ABSTRACT

In  understanding  a  metaphorical  utter¬ 
ance,  there  is  the  question  of  how  to  use  the 
analogical  mapping  (if  any)  associated  with 
the  metaphor,  once  this  mapping  is  known. 
It  is  usually  assumed  that  one  should  trans¬ 
late  the  situation  literally  depicted  by  the  ut¬ 
terance  into  terms  of  the  target  domain,  and 
that  this  requires  extending  the  mapping  to 
source  items  and  structure  that  are  not  yet 
mapped  by  the  analogy.  However,  this  paper 
argues  that  it  is  a  mistake  to  think  that  such
extension  must  generally  be  done.  This  mis¬ 
take  arises  from  an  unworkable  assumption 
that  metaphorical  utterances  must  generally 
be  assigned  meanings  other  than  their  literal 
ones.  Instead,  the  paper  advocates  an  ap¬ 
proach  that  treats  a  literal  meaning  as  a  basis 
for  an  indefinite  amount  of  within-source  in¬ 
ference  connecting  with  the  existing  analog¬ 
ical  mapping,  but  does  not  seek  to  extend  it. 
This  approach  has  been  implemented,  in  a 
reasoning  system  called  ATT-Meta. 

INTRODUCTION  AND  APPROACH 

Consider an utterance of the sentence “Those two ideas were in different
corners of John’s mind.” This rests upon the metaphors of MIND AS PHYSICAL
SPACE and IDEAS AS PHYSICAL OBJECTS. In this paper, metaphors are conceptual
views, not utterances or parts of utterances (cf. Lakoff 1993). Rather, the
utterance is merely one possible linguistic “manifestation” of the
metaphor(s). I assume that “corners” has as one of its primary literal
meanings the corners of a room, and that therefore John’s mind is being
metaphorically viewed as being or containing a room. Let’s suppose that the
addressee (hearer, reader) of the utterance is familiar with the two
metaphors mentioned, but has not before encountered the idea of something
being in a “corner” of someone’s mind. How is the addressee to make sense of
the sentence?

Let’s  assume  that  the  addressee  takes  the 
analogy  underlying  MIND  AS  PHYSICAL 
SPACE  to  include  the  following:  a  mind  cor¬ 
responds  to  a  bounded  or  unbounded  physi¬ 
cal  region;  and  mental  entities  or  events  cor¬ 
respond  to  entities  or  events  located  in  that 
region.  I  will  assume  that  for  the  addressee 
the  analogy  underlying  IDEAS  AS  PHYSICAL
OBJECTS  includes  the  following:
someone’s  ideas  (special  case  of  mental  ob¬ 
jects)  correspond  to  some  physical  objects; 
and  inferential  interaction  between  mental 
objects  or  events  corresponds  to  physical  in¬ 
teraction  of  the  corresponding  physical  enti¬ 
ties.  But  let’s  suppose  that  the  analogies  say 
nothing  specifically  about  what  it  is  about  the 
mind  that  “corners”  correspond  to,  or  how 
being  in  “corners”  could  possibly  be  signifi¬ 
cant  for  mental  entities. 

Consider an utterance that manifests some metaphors. Let M be a subset of
these metaphors. A “source-based” meaning of the utterance, with respect to
M, is one that arises from treating the metaphors in M as objectively true
views. That is, in our corners example, the source-based meaning with respect
to both the MIND AS PHYSICAL SPACE and IDEAS AS PHYSICAL OBJECTS metaphors
casts the mentioned ideas as literally being physically situated in physical
corners that are literally physically inside the person’s mind. By contrast,
a “target-based” meaning of the utterance, with respect to M, would cash out
the metaphors in M in terms of their targets. To switch examples, the
target-based meaning of “The idea was in John’s mind”, with respect to the
MIND AS PHYSICAL SPACE metaphor, could be that John was considering the idea
in some way. In cases where an utterance manifests only one metaphor,
source-based and target-based meanings can be called “literal” and
“metaphorical” meanings respectively.

Many metaphor theorists appear to assume that (A) the goal of
metaphorical-utterance processing is to construct a representation of a
target-based meaning. (For instance, much psychological research on metaphor
makes reference to the “metaphorical” (i.e., target-based) meaning, tacitly
assuming that it exists and needs to be determined.) It then appears obvious
that (B) this representation should contain elements that correspond to the
major source-domain elements that appear in the utterance. So, in our
example, the representation should involve aspects of John’s mind that are
being viewed as corners, in some suitable extension of the analogy involved
in MIND AS PHYSICAL SPACE. Also, the property of something being in a corner
would have to map over as well, to some property of ideas. Since the
analogies behind the two metaphors manifested by the corners utterance have
nothing to say directly about corners, we have a case of unmapped
source-domain entities and associated structure being mapped (transferred) to
the target domain.

Indeed, analogy processing is classically divided up into retrieval,
matching, transfer, evaluation and so forth, with transfer being central when
an analogy is used creatively, such as when an utterance is a novel
manifestation of a familiar metaphor. The purpose of the transfer phase is
typically to transfer (in adapted form) as much unmapped structure as
possible from the source to the target. A common variant on this idea is to
do transfer in a goal-directed way, where the goals arise from reasoning
tasks in the target domain. Nevertheless the idea is still that of mapping
unmapped structure.

In  Martin’s  (1990)  work,  the  system  at¬ 
tempts  to  map  so-far  unmapped  source  items 
to  the  target.  The  SAPPER  system  (Veale  & 
Keane  1997)  is  prolific  in  its  attempts  to  extend
mappings.  Grady  (1997),  in  discussing  the
THEORIES  AS  BUILDINGS  metaphor, 
points  out  that  things  such  as  French  windows 
or  tenants  in  a  building  have  no  natural  corre¬ 
spondence  with  anything  in  theories,  and  im¬ 
plies  that  in  order  to  make  sense  of  sentences 
mentioning  the  French  windows  or  tenants  of 
a  theory  the  understander  must  use  additional 
metaphors  to  map  the  French  windows  or 
whatever.  He  therefore  seems  to  subscribe  to 
forms  of  (A)  and  (B).  On  the  other  hand,  as  an 
exception  to  the  present  comments,  Hobbs 
(1990)  appears  not  to  subscribe  to  (A)  and  (B)  in
dealing  with  familiar  metaphor;  indeed,  the 
approach  in  this  paper  has  some  strong  simi¬ 
larities  to  his.  (Of  course,  if  the  task  is  to  deal 
with  entirely  novel  metaphors  in  an  utterance, 
as  opposed  to  novel  manifestations  of  famil¬ 
iar  metaphors,  then  unmapped  structure  must 
perforce  be  mapped.  This  paper  is  not  about 
dealing  with  novel  metaphors.) 

Going back to our “corners” example, my claim is that it is extremely hard,
if not impossible, to come up with a convincing mapping from physical corners
to aspects of minds, even though the utterance is readily understandable. We
simply do not know enough, whether commonsensically or scientifically, about
how the mind works. (And I hope that it is fairly obvious from what follows
that if there isn’t anything in John’s mind that can confidently be cast as
“corners” then there’s no point, from the point of view of discourse
understanding, imposing on John’s mind the stipulation that it contain
something corresponding to corners.) The claim I am making is just an
expression of the notorious unparaphrasability of many metaphorical
utterances. However, the notoriety has not gained sufficient respect among
researchers who actually devise metaphor-processing schemes.

As  a  variant  on  the  corners  utterance,  one
could  say  “Those  ideas  were  in  different  recess¬ 
es  of  John’s  mind.”  If  one  attempts  to  map  cor¬ 
ners,  one  should  presumably  also  attempt  to 
map  recesses.  The  utterances  have  much  the 
same  effect  but  are  subtly  different  in  their  con¬ 
notations.  The  recesses  sentence  conveys  more 
strongly  that  the  ideas  were  “hidden  away”  or 
inaccessible.  But  do  we  know  enough  about  the 
mind  to  say  whether  corners  and  recesses  themselves
should  map  to  subtly  different  aspects
of  John’s  mind,  and  to  say  what  those  aspects 
are?  (Answer:  no.)  Isn’t  it  rather  that  this  new 
example  and  the  old  one  convey  that  the  ideas 
in  question  were  in  somewhat  subordinate,  hid¬ 
den  or  inaccessible  positions  in  John’s  mind, 
where  moreover  those  positions  were  relative¬ 
ly  inaccessible  or  distant  from  each  other,  so 
that  physical  interaction  between  the  ideas  was 
unlikely?  If  so,  then  the  mapping  mentioned 
above  between  physical  interaction  and  infer¬ 
ential  interaction  comes  into  play,  to  generate 
the  connotation  that  the  ideas  (probably)  did 
not  inferentially  interact. 

The  process  of  going  from  the  ideas  being 
in  corners  or  recesses  to  the  hypothesis  that  the
ideas  do  not  physically  interact  is  one  of  with- 
in-source  reasoning  (or  within-vehicle  reason¬ 
ing  as  I  have  called  it  elsewhere)  conducted  on 
the  basis  of  the  source-based  meaning  of  the 
utterance.  The  amount  of  within-source  reason¬ 
ing  needed  to  meet  a  source  element  such  as 
physical  interaction  that  is  mapped  over  to  the 
target  is  not  in  principle  limited. 
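
As a rough illustration of this idea, and not a description of how ATT-Meta
is actually implemented, within-source reasoning can be sketched as forward
chaining over source-domain rules, with only those derived facts that the
existing mapping already covers being carried over to the target. The rules,
predicate strings and function names below are invented for the sketch, and
the sketch ignores the uncertainty ("probably") that attaches to such
conclusions.

```python
# Toy within-source rules: (set of premises, conclusion), all in source terms.
SOURCE_RULES = [
    ({"in_different_corners(idea1, idea2)"}, "far_apart(idea1, idea2)"),
    ({"far_apart(idea1, idea2)"},
     "unlikely_physical_interaction(idea1, idea2)"),
]

# The existing analogical mapping: only these source facts have a target
# counterpart. Note that "corners" itself is never mapped.
MAPPING = {
    "unlikely_physical_interaction(idea1, idea2)":
        "unlikely_inferential_interaction(idea1, idea2)",
}


def within_source_closure(facts, rules):
    """Forward-chain over source-domain rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def target_connotations(source_meaning, rules=SOURCE_RULES, mapping=MAPPING):
    """Carry over only the derived source facts that the mapping covers."""
    derived = within_source_closure(source_meaning, rules)
    return {mapping[fact] for fact in derived if fact in mapping}


print(target_connotations({"in_different_corners(idea1, idea2)"}))
# {'unlikely_inferential_interaction(idea1, idea2)'}
```

The number of rule applications needed before a mapped fact is reached is
not bounded by the sketch, which mirrors the point that the amount of
within-source reasoning is not limited in principle.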

But  in  special  cases  no  within-source  in¬ 
ference  may  be  needed  at  all  —  if  the  source- 
based  meaning  is  one  that  can  be  directly 
mapped  by  the  existing  analogical  mapping.  For 
example,  the  utterance  might  be  “these  ideas 
were  in  John’s  mind,”  which  might  be  imme¬ 
diately  mappable  by  the  analogy  to  the  propo¬ 
sition  that  John  was  considering  the  ideas.  This 
proposition  can  then  be  called  the  target-based 
meaning.  But  in  general  the  advocated  approach 
does  not  lead  to  anything  that  can  straightforwardly
be  called  the  target-based  meaning.  That
is,  the  approach  is  semantically  agnostic  with 
respect  to  target-based  meanings.  It  is  akin  to 
but  less  extreme  than  the  viewpoint  of  David¬ 
son  (1979).  Davidson  likewise  puts  great  weight 
on  connotations  drawn  from  the  source-based 
meaning  by  pragmatic  processes.  However, 
unlike  the  case  in  Davidson  there  is  no  objec¬ 
tion  in  the  present  paper’s  approach  to  some¬ 
one  taking  the  terminological  stance  that,  say, 
any  set  of  inferences  that  happens  to  be  drawn 
from  the  source-based  meaning  of  an  utterance
in  context  constitutes  the  target-based  mean¬ 
ing  (in  context)  for  the  utterance. 

Fortunately,  to  accompany  the  claim  that 
target-based  meanings  (in  a  more  traditional 
sense  than  just  imagined)  often  cannot  be 
found,  I  claim  that  it  is  often  or  perhaps  even 
typically  not  necessary  to  find  them  in  the  first 
place.  What  is  important  is  for  any  given  met¬ 
aphorical  sentence  to  contribute  information 
to  the  overall  discourse.  Whether  it  does  so 
through  the  medium  of  target-based  meanings 
or  in  the  looser  way  described  above  is  a  sec¬ 
ondary  issue.  Notice  that,  in  practice,  a  sen¬ 
tence  like  “Those  two  ideas  were  in  different 
corners  of  John’s  mind”  will  be  in  some  context
where  it  matters  that  the  ideas  in  question
are  in  different  corners.  For  instance,  the  discourse
might  be  of  the  form  “...  John  didn’t
see  the  consequences  of  [some  ideas].  They 
were  in  different  corners  of  his  mind....”  If  the 
understander  is  predisposed  to  seek  coherence 
relationships  between  sentences,  and  tries  the 
idea  that  the  second  is  an  explanation  for  the 
first, then the understander will be primed to investigate
John’s drawing of inferences from
the ideas. Thus, the above within-source inferences
could be constructed backwards, in a
goal-directed sense, meeting up eventually
with  the  source-based  meaning. 

To  go  back  to  the  discussion  of  THEORIES 
AS  BUILDINGS,  suppose  that  someone  says 
“Mary  overhauled  the  theory  from  the  plumb¬ 
ing  to  the  chimney-stacks.”  This  surely  con¬ 
notes  that  Mary  very  thoroughly  overhauled  the 
theory.  We  can  get  this  connotation  without 
having to worry at all about what features of
theories correspond to plumbing and chimney-stacks.
It is a plausible conjecture that those
items  are  only  mentioned  by  the  speaker  to 
emphasize  that  the  overhaul  is  thorough.  So,  it 
is  simply  that  within-source  inferencing  is  need¬ 
ed  to  infer  that  the  physical  overhaul  is  thor¬ 
ough.  Assuming  then  that,  under  the  metaphor, 
physical  overhauling  maps  to  large-scale  mod¬ 
ification  of  the  theory,  and  that  thoroughness 
of  physical  overhaul  maps  to  thoroughness  of 
that  modification,  the  connotation  mentioned 
above  can  be  inferred. 

The  advocated  approach  refrains  from  as¬ 
suming  that  the  point  of  metaphor  is  for  pat¬ 
terns  of  inference  in  the  source  domain  to  be 
mapped  to  or  imposed  upon  the  target  domain, 
as  is  assumed  in  much  writing  on  metaphor 
(incl. Black, 1979; Lakoff & Turner, 1989).
This  is  not  to  say  that  such  mapping  or  impo¬ 
sition  cannot  be  done  or  should  not  ever  be 
done,  but  just  that  for  most  cases  of  under¬ 
standing  novel  manifestations  of  metaphors, 
it  is  not  necessary  (and  may  not  be  possible) 
to map inference patterns that are not already
mapped  by  the  analogy  underlying  the  meta¬ 
phor.  (Of  course,  that  existing  analogy  will 
typically  map  some  inference  patterns  over  to 
the  target.)  The  claim  is  that  often,  or  even 
normally,  the  products  of  within-source  infer¬ 
ence  are  what’s  important,  not  their  pattern. 
But  it  is  certainly  possible  for  novel  manifes¬ 
tations  of  a  metaphor  to  require  unmapped 
source  entities  and  structure  to  be  mapped  to 
the  target.  For  instance,  if  someone  describes 
something as being “the chimney-stack of
the  theory”  then  obviously  some  target  corre¬ 
spondence  for  chimney-stacks  must  be  found. 

The  rest  of  the  paper  is  mainly  about  how 
the  advocated  approach  is  fleshed  out  in  the 
ATT-Meta  system.  The  remaining  sections  of 
the  paper  are  as  follows:  a  section  on  the  main 
type  of  metaphorical  utterance  considered  in 
the  research;  a  section  very  briefly  sketching 
ATT-Meta’s  basic  reasoning  facilities,  irrespec¬ 
tive  of  metaphor;  a  section  describing  ATT- 
Meta’s  metaphorical  reasoning;  a  section  on 
various types of uncertainty handled in ATT-Meta’s
metaphorical reasoning, and on an observation
that metaphor-based inferences
should  often  override  target  information;  and  a 
section  containing  final  remarks. 

METAPHORS  HANDLED 
BY  ATT-META 

The ATT-Meta system does not currently
deal  with  novel  metaphors  —  rather,  it  has  pre¬ 
given  knowledge  of  a  specific  set  of  meta¬ 
phors,  including  MIND  AS  PHYSICAL 
SPACE  and  IDEAS  AS  PHYSICAL  OB¬ 
JECTS.  But  it  is  specifically  designed  to  han¬ 
dle  novel  manifestations  of  those  metaphors. 
Its  knowledge  of  a  metaphor  consists  mostly 
of a relatively small set of very general “conversion
rules” that map between the source and
target  domains.  These  encapsulate  what  the 
system  knows  about  the  analogy  behind  the 
metaphor.  The  degree  of  novelty  that  the  sys¬ 
tem  can  handle  in  a  manifestation  of  a  meta¬ 
phor  is  limited  only  by  the  amount  of  knowl¬ 
edge  it  has  about  the  source  domain  and  by 
the  generality  of  the  conversion  rules. 

The  ATT-Meta  research  has  concentrated 
on  metaphors  for  mental  states,  although  the 
principles and algorithms implemented are not
restricted to or specialized for such metaphors.
Mundane types of discourse, such as ordinary
conversations and newspaper articles, often use
metaphor in talking about mental states/processes
of agents. Indeed, as with many abstract
topics, as soon as anything at all subtle or complex
needs to be said, metaphor is practically
essential. There are many mental-state metaphors
apart from the two mentioned already.
Some are as follows: COGNITION AS VISION,
as when understanding, realization,
knowledge, etc. is cast as vision, as in “His view
of the problem was blurred;” IDEAS AS INTERNAL
UTTERANCES, which is manifested
when a person’s thoughts are described as
internal speech or writing (internal speech is
not literally speech), as in “He said to himself
that he ought to stay at home and work;” and
MIND PARTS AS PERSONS, under which a
person’s mind is cast as containing several subagents
with their own thoughts, emotions, etc.,
as in “Part of him was convinced that he should
go to the party.” Many real-discourse examples
of mental-state metaphor can be found in a databank
at http://www.cs.bham.ac.uk/jab/ATT-Meta/Databank.

As well as being able to reason metaphorically
about agents’ beliefs and reasoning, ATT-Meta
has general, non-metaphor-related facilities
for reasoning about agents’ beliefs and reasoning.
These facilities are beyond the scope
of the present paper, but are described in
Barnden (1998) (see also Barnden, to appear,
and Barnden, in press; and see Barnden et al.,
1994, for an early version).

ATT-META’S  BASIC  REASONING 

ATT-Meta  is  merely  a  reasoning  system, 
and  does  not  deal  with  natural  language  input 
directly.  Rather,  a  user  supplies  hand-coded 
logic formulae that are intended to couch, albeit
simplistically, the source-based meaning
of  small  discourse  chunks  (typically  two  or 
three  sentences). 

ATT-Meta  is  rule-based,  and  manipulates 
hypotheses  (facts,  conclusions  or  goals),  repre¬ 
sented  as  expressions  in  a  situation-based  or 
episode-based  first-order  logic  somewhat  akin 
to  that  of  Hobbs  (1990).  At  any  time,  any  par¬ 
ticular  hypothesis  H  is  tagged  with  a  certainty 
level,  one  of  certain,  presumed,  suggested,  pos¬ 
sible  or  certainly-not.  The  last  just  means  that 
the  negation  of  H  is  certain.  Possible  just  means 
that  the  negation  of  H  is  not  certain  but  no  evi¬ 
dence  has  yet  been  found  for  H  itself.  Presumed 
means  that  H  is  a  default:  i.e.,  it  is  taken  as  a 
working  assumption,  pending  further  evidence. 
Suggested  means  that  there  is  evidence  for  the 
hypothesis,  but  it  is  not  (yet)  strong  enough  to 
enable  H  to  be  a  working  assumption. 

ATT-Meta  applies  its  rules  in  a  backchain- 
ing  style.  It  is  given  a  reasoning  goal,  and  uses 
rules  backwards  to  generate  supporting  sub¬ 
goals.  When  a  rule  application  supports  a  hy¬ 
pothesis,  it  supplies  a  level  of  certainty  to  the 
hypothesis,  calculated  as  the  minimum  of  the 
rule’s own certainty level and the levels picked
up from the hypotheses satisfying the rule’s
condition part. When several rules support a
hypothesis,  the  maximum  of  their  certainty 
contributions  is  taken. 
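
The combination scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ATT-Meta's own code (which, as noted later in the paper, is represented as Quintus Prolog expressions). The names Certainty, rule_contribution and combine_support are hypothetical, as are the numeric ordering of the levels and the choice to report possible when no rule supports a hypothesis.

```python
from enum import IntEnum


class Certainty(IntEnum):
    # Assumed ordering for this sketch: a larger value means stronger support.
    # CERTAINLY_NOT is listed for completeness; a hypothesis at that level
    # would not satisfy a rule condition, so it is never passed in below.
    CERTAINLY_NOT = 0
    POSSIBLE = 1
    SUGGESTED = 2
    PRESUMED = 3
    CERTAIN = 4


def rule_contribution(rule_level, condition_levels):
    """Level that one rule application gives to its conclusion: the minimum
    of the rule's own level and the levels of the satisfied conditions."""
    return min([rule_level, *condition_levels])


def combine_support(contributions):
    """Level of a hypothesis supported by several rules: the maximum of
    their contributions (POSSIBLE if nothing supports it, by assumption)."""
    return max(contributions, default=Certainty.POSSIBLE)
```

For instance, a presumed rule applied to conditions that are all certain contributes presumed, and a hypothesis supported at both suggested and presumed ends up presumed.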

When  both  a  hypothesis  H  and  its  negation 
-H  are  supported  to  level  presumed,  conflict- 
resolution  takes  place.  The  system  attempts  to 
see  whether  one  hypothesis  has  more  specific 
evidence  than  the  other.  If  a  hypothesis  is  more 
specifically  supported  than  its  negation,  it  stays 
presumed  and  the  negation  is  downgraded  to 
suggested.  If  neither  hypothesis  wins,  both  are 
downgraded  to  suggested.  Under  certain  con¬ 
ditions,  one  way  for  a  hypothesis  to  be  more 
specifically  supported  than  its  negation  is  for  it 
to  be  supported  (directly  or  indirectly)  by  a 
proper  superset  of  the  facts  supporting  the  ne¬ 
gation.  Inter-derivability  relationships  between 
hypotheses  appearing  in  the  support  networks 
are also used in specificity comparison.
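
A minimal sketch of the superset test may help fix ideas. It covers only the simplest case described above: it ignores the "certain conditions" under which the test applies and the inter-derivability relationships, and the function name resolve_conflict and the string labels for levels are assumptions made for illustration.

```python
def resolve_conflict(facts_for, facts_against):
    """Resolve a clash between H and its negation, both at level 'presumed'.

    facts_for / facts_against are the sets of facts that (directly or
    indirectly) support H and its negation. Returns the pair
    (level of H, level of the negation of H) after resolution.
    """
    facts_for = frozenset(facts_for)
    facts_against = frozenset(facts_against)
    if facts_for > facts_against:       # proper superset: H is more specific
        return ("presumed", "suggested")
    if facts_against > facts_for:       # the negation is more specific
        return ("suggested", "presumed")
    return ("suggested", "suggested")   # no winner: both are downgraded
```

With a hypothetical textbook case, where H is supported by the facts {"bird", "penguin"} and its negation only by {"bird"}, H stays presumed and the negation drops to suggested.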

This paper will not display ATT-Meta’s
formal representations and formal rule formats
(which are in turn represented as Quintus
Prolog expressions), and will use English
glosses instead. These glosses may use the
past tense to match the tense of English example
sentences, but this is just for readability,
and ATT-Meta currently has no working
treatment of time. Detail on the representational
style is in Barnden et al. (1994), and
considerable detail on ATT-Meta’s general
reasoning framework (and belief reasoning)
can be found in Barnden (1998). As explained
in Barnden (1998), ATT-Meta’s algorithms
for metaphor-based reasoning are almost
identical to those for belief reasoning.

ATT-META’S  METAPHORICAL 
REASONING 

We will continue to consider the corners
sentence, and to assume that it has the following
connotation:

Connotation: The mentioned ideas (as
mental entities of John’s) did not inferentially
interact.

ATT-Meta’s approach to deriving such a connotation
involves source-based pretence (or literal
pretence as I have called it elsewhere). A
source-meaning representation for the metaphorical
input utterance is given to the system, and the
system  pretends  that  this  representation,  howev¬ 
er  ridiculous  it  is  in  reality,  is  true.  Within  the  con¬ 
text  of  this  pretence,  the  system  can  do  any  amount 
of  within-source  reasoning  (reasoning  that  arises 
from  its  knowledge  of  the  source  domains  of  the 
metaphors  involved)  using  the  source-based 
meaning.  In  our  example,  it  can  use  knowledge 
about  mundane  physical  objects,  rooms  and  cor¬ 
ners.  The  key  point  is  that  this  within-pretence 
reasoning  from  the  source-based  meaning  of  the 
utterance links up with the analogical mappings
involved in the metaphor. In the present case, as
explained  in  the  Introduction,  the  relevant  map¬ 
ping  is  that  from  (lack  of)  physical  interaction  to 
(lack  of)  inferential  interaction. 

That  mapping  is  itself  of  a  very  fundamen¬ 
tal,  general  nature,  and  does  not,  for  instance, 
rely on the notion of corners or rooms. Any process
of within-source inference that linked up
with  physical  interaction  could  lead  to  conclu¬ 
sions  that  ideas  did  or  did  not  inferentially  in¬ 
teract.  There  is  no  need  at  all  for  ATT-Meta  to 
have any knowledge of how corners or rooms
match  to  aspects  of  the  mind. 

ATT-Meta  proceeds  as  follows  in  dealing 
with the hand-constructed logical input corresponding
to the corners sentence. This input is
paraphrased as source-based premises (L1) to (L6)
below. The system uses a computational environment
called a metaphorical pretence cocoon to
hold those premises and the within-pretence reasoning.
The following shows hypotheses that are
placed inside and outside the cocoon:

Inside the Cocoon

(L1) Idea1 is in corner1.

(L2) Idea2 is in corner2.

(L3) Corner1 is a corner of John’s mind.

(L4) Corner2 is a corner of John’s mind.

(L5) Corner1 and Corner2 are distinct.

(L6) John’s mind is a room.

Outside the Cocoon

(SL.i) I (the system) am pretending that
(L.i) holds.

(for i from 1 to 6)


Actually,  to  include  (L6)  is  an  over-sim¬ 
plification,  because  of  course  containers  other 
than rooms have corners; also, that John’s mind
is presumably a container can follow by within-source
reasoning from the fact that it has
corners, so there is no real need to include something
like (L6) from the start.

Given  that  the  system  knows  that  a  room  is 
a  physical  space  (or,  more  precisely,  has  a  phys¬ 
ical  space  as  a  part),  it  can  infer  within  the  pre¬ 
tence  cocoon  that  John’s  mind  is  a  physical 
space. It can also infer that idea1 and idea2 are
presumably physical objects, because normally
only physical objects are in physical corners.
Thus,  it  is  at  this  point  that  the  system  in  es¬ 
sence  realizes  that  the  utterance  manifests  the 
metaphors  of  MIND  AS  PHYSICAL  SPACE 
and  IDEAS  AS  PHYSICAL  OBJECTS. 

As  usual,  the  system  is  given  a  reasoning 
goal,  say 

(G1) John believes P.

Suppose that P is some proposition that can
be inferred from the ideas mentioned in the utterance
(idea1 and idea2). By a process explained
in Barnden (1998), (G1) leads to the
task of seeing whether

(G2) idea1 and idea2
inferentially interact.

(cf.  the  Connotation  above).  Now,  the  analogi¬ 
cal  mapping  between  physical  interaction  and 
inferential  interaction  appears  in  the  system  as 
a small collection of “conversion” rules, converting
between metaphorical and non-metaphorical
terms. One such rule can be paraphrased
as

Conversion  Rule  CONV 

IF  I  (the  system)  am  pretending  that  ideas 
J  and  K  of  agent  X  are  physical  objects 

AND  I  am  pretending  that  J  and  K  do  not 
physically  interact 

THEN [presumed] J and K do not inferentially
interact.

The  presumed  is  the  rule’s  own  certainty 
qualifier, and serves as an upper bound on the
certainty that can be attached to the rule’s conclusion
by virtue of an application of the rule.
In  backwards  application  to  the  negation  of 
(G2),  which  is  investigated  along  with  (G2)  it¬ 
self,  CONV  leads  to  the  creation  of  the  subgoal 

((G3))  I  (the  system)  am  pretending  that  J 
and  K  do  not  physically  interact. 

All  the  goals  so  far  mentioned  are  outside 
the  metaphorical  pretence  cocoon,  but  (G3)  is 
automatically  accompanied  by  the  subgoal 

(G4) J and K do not physically interact

within  the  cocoon.  Now,  as  part  of  the  system’s 
knowledge  about  physical  objects  and  space, 
there  is  the  rule: 

IF  two  physical  objects  are  physically 
separated 

THEN  [presumed]  they  do  not  physically 
interact. 

Within  the  cocoon  the  system  therefore 
gets  the  reasoning  subgoal  that  J  and  K  are 
physically  separated.  With  the  aid  of  rules 
about corners, things in corners, and separation,
the system can establish this subgoal to
certainty level presumed on the basis of the
facts (L1) to (L5) in the cocoon. (These facts
are certain.) As a result, (G4), (G3), (G2) and
(G1) attain level presumed. Notice carefully
that the inferencing supporting (G4) is entirely
“within-source”: it merely uses common-sense
knowledge about mundane physical objects
and physical space.
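
The chain of goals in the corners example can be imitated by a toy backward chainer. The sketch below is not ATT-Meta's representation: hypotheses are plain strings rather than logic formulae, the rules are already instantiated for the two ideas, and the two within-cocoon rules about corners are stand-ins for the system's real knowledge about corners and separation. All names (FACTS, RULES, PRETEND, prove) are hypothetical.

```python
# Illustrative sketch only; the propositional encoding is an assumption.
LEVELS = {"possible": 1, "suggested": 2, "presumed": 3, "certain": 4}

PRETEND = "pretend: "  # marks a pretence hypothesis outside the cocoon

# Hypotheses are (environment, text) pairs; "cocoon" is the pretence cocoon.
FACTS = {
    ("cocoon", "idea1 is in corner1"): "certain",                 # (L1)
    ("cocoon", "idea2 is in corner2"): "certain",                 # (L2)
    ("cocoon", "corner1 is a corner of John's mind"): "certain",  # (L3)
    ("cocoon", "corner2 is a corner of John's mind"): "certain",  # (L4)
    ("cocoon", "corner1 and corner2 are distinct"): "certain",    # (L5)
}

# Each rule is (conclusion, conditions, the rule's own certainty level).
RULES = [
    # Stand-in: normally only physical objects are in physical corners.
    (("cocoon", "idea1 and idea2 are physical objects"),
     [("cocoon", "idea1 is in corner1"), ("cocoon", "idea2 is in corner2")],
     "presumed"),
    # Stand-in for the rules about corners, things in corners, separation.
    (("cocoon", "idea1 and idea2 are physically separated"),
     [("cocoon", "idea1 is in corner1"), ("cocoon", "idea2 is in corner2"),
      ("cocoon", "corner1 and corner2 are distinct")],
     "presumed"),
    # Physically separated objects presumably do not physically interact.
    (("cocoon", "idea1 and idea2 do not physically interact"),
     [("cocoon", "idea1 and idea2 are physically separated")],
     "presumed"),
    # Conversion rule CONV, instantiated for idea1 and idea2.
    (("top", "idea1 and idea2 do not inferentially interact"),
     [("top", PRETEND + "idea1 and idea2 are physical objects"),
      ("top", PRETEND + "idea1 and idea2 do not physically interact")],
     "presumed"),
]


def prove(goal):
    """Backchain from goal and return the certainty level it attains."""
    env, text = goal
    if env == "top" and text.startswith(PRETEND):
        # A pretence goal outside the cocoon is accompanied by the
        # corresponding subgoal inside the cocoon.
        return prove(("cocoon", text[len(PRETEND):]))
    if goal in FACTS:
        return FACTS[goal]
    best = "possible"  # no evidence found so far
    for conclusion, conditions, rule_level in RULES:
        if conclusion != goal:
            continue
        levels = [prove(condition) for condition in conditions]
        # Minimum over the rule's level and its conditions' levels ...
        contribution = min([rule_level, *levels], key=LEVELS.get)
        # ... and maximum over all rules supporting the same goal.
        best = max(best, contribution, key=LEVELS.get)
    return best
```

Calling prove(("top", "idea1 and idea2 do not inferentially interact")) walks from the conversion rule into the cocoon and back, and the only part that mentions corners is the within-cocoon reasoning, which is the point made in the paragraph above.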

A hypothesis like “I (the system) am pretending
that P” is called a pretence hypothesis.
For  each  such  formula  outside  the  cocoon,  P 
appears  inside  the  cocoon,  and  conversely.  The 
hypotheses  within  the  cocoon  are  noted  as  be¬ 
ing  within  the  cocoon  by  being  tagged  with  the 
system’s  name  for  the  cocoon.  Such  tags  are 
passed  around  by  reasoning  rules,  so  that  rule 
applications  on  hypotheses  within  the  cocoon 
lead  only  to  within-cocoon  hypotheses.  But  the 
tags  do  not  otherwise  affect  rule  application. 
Thus,  application  of  a  rule  within  a  cocoon  is 
virtually  identical  to  application  outside  the 
cocoon. And, currently, all rules available for
the system’s reasoning outside cocoons can also
be  used  within  cocoons. 

UNCERTAINTY  IN  METAPHOR 

Because  reasoning  within  a  cocoon  uses  the 
same  algorithms  as  that  outside,  uncertainty  is 
handled  within  a  pretence  cocoon  just  as  it  is 
outside.  Partly  as  a  result  of  this,  ATT-Meta 
includes  the  following  three  types  of  uncertainty 
handling  in  its  metaphor-based  reasoning. 
(U1) Given an utterance, it is often not certain
what  particular  metaphors  or  variants  of 
them  are  manifested.  Correspondingly, 
ATT-Meta  may  merely  have  presumed, 
for  instance,  as  a  tentative  level  of  cer¬ 
tainty  for  pretence  premises  like  the  (SL.i) 
above  (even  though  the  L.i  themselves  are 
certain).  This  hypothesis  is  then  potential¬ 
ly  subject  to  defeat. 

(U2) Conversion rules like CONV are merely
default  rules  (i.e.  their  strength  is  pre¬ 
sumed).  There  can  be  evidence  against  the 
conclusion  of  the  rule.  Whether  the  con¬ 
clusion  survives  as  a  default  (presumed) 
hypothesis  depends  on  the  relative  speci¬ 
ficity  of  the  evidence  for  and  against  the 
conclusion. Thus, whether a piece of metaphorical
reasoning overrides or is
overridden by other lines of reasoning
about the target is a matter of the peculiarities
of the case at hand. It is incorrect to
think that the target should always have the
upper  hand,  because  its  own  information 
may  itself  be  uncertain.  It  must  be  real¬ 
ized  that,  just  as  with  non-metaphorical 
utterances,  a  metaphorical  utterance  can 
express  an  exception  to  some  situation 
that  would  normally  apply  in  the  target 
domain.  To  say  “The  company  nursed  its 
competitor  back  to  health”  contradicts 
default  knowledge  that  companies  do  not 
normally  help  their  competitors,  and 
should override that knowledge.

(U3)  Knowledge  about  the  source  domain  of 
the  metaphor  is  itself  generally  uncertain. 
Correspondingly, in ATT-Meta the hypotheses
and reasoning within the cocoon
are usually uncertain. For instance, it is
not certain that physical objects do not interact
because they are physically separated,
a default we used in the corners
example.  Thus,  conversion  rules  map 
within-cocoon  information  that  is  usual¬ 
ly  already  uncertain,  so  for  this  reason 
alone  their  results  are  generally  uncertain. 

Because  there  is  uncertain  reasoning  both 
within  and  outside  the  cocoon,  special  compli¬ 
cations  arise  for  conflict  resolution.  A  particu¬ 
lar  complication  is  that  the  pretence  cocoon  can 
contain  as  a  fact  any  fact  sitting  outside.  This 
importation  of  facts  is  needed  because  arbitrary 
information  about,  say,  physical  objects  may 
be  needed  in  a  pretence  cocoon  used  for  a  met¬ 
aphor  like  IDEAS  AS  PHYSICAL  OBJECTS. 
Also,  non-source-domain  rules  can  be  used 
within  the  cocoon.  But  the  imported  facts  and 
the  non-source  rules  may  support  something 
that  conflicts  with  conclusions  drawn  from  the 
special  metaphorical  facts  inserted  into  the  co¬ 
coon  at  the  start  (like  the  L.i  facts  in  the  cor¬ 
ners  example).  However,  the  system  adopts  the 
heuristic  that  metaphorical  facts  like  the  L.i 
supply  added  specificity.  Therefore,  ATT-Meta 
proceeds  as  follows:  within  a  metaphorical  pre¬ 
tence  cocoon,  specificity-comparison  is  first 
attempted  in  a  mode  where  all  reasoning  lines 
partially  dependent  on  imported  facts  are 
thrown  away.  Only  if  this  does  not  yield  a  win¬ 
ner  are  those  lines  restored,  and  specificity  re¬ 
assessed.  This  means  that  imported  facts  are 
downplayed  in  their  effects. 
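
The two-pass comparison can be sketched as follows. This is a simplified reading of the heuristic, assuming that specificity is judged by the proper-superset test on supporting facts mentioned earlier; the function names winner and compare_in_cocoon, and the representation of a reasoning line as the set of facts it rests on, are hypothetical.

```python
def winner(lines_for, lines_against):
    """Compare two collections of reasoning lines by the facts they rest on.

    Each line is a set of fact names. Returns "for", "against", or None
    when neither side is supported by a proper superset of the other's facts.
    """
    facts_for = set().union(*lines_for)
    facts_against = set().union(*lines_against)
    if facts_for > facts_against:
        return "for"
    if facts_against > facts_for:
        return "against"
    return None


def compare_in_cocoon(lines_for, lines_against, imported):
    """Two-pass specificity comparison inside a pretence cocoon."""

    def without_imports(lines):
        # Drop every line that depends, even partially, on an imported fact.
        return [line for line in lines if not (set(line) & imported)]

    # Pass 1: lines that rely on imported facts are thrown away.
    first = winner(without_imports(lines_for), without_imports(lines_against))
    if first is not None:
        return first
    # Pass 2: only if there is no winner, restore those lines and reassess.
    return winner(lines_for, lines_against)
```

As a hypothetical case, suppose one line rests on {"L1", "L2", "L5"} and an opposing line rests on the same facts plus an imported fact. A plain superset test would favor the opposing line, whereas the two-pass comparison discards it in the first pass and lets the line built on the metaphorical facts win.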

Because  of  the  multiple  environments  in 
which  reasoning  is  done  (namely:  the  system’s 
top-level environment; pretence cocoons; similar
environments used for simulating agents’
reasoning) and because pretence cocoons and
agent-reasoning simulation environments can
be  nested  within  each  other  to  arbitrary  depth, 
conflict-resolution  in  ATT-Meta  is  a  complex 
matter  that  has  to  proceed  down  through  layers 
of  nesting  in  an  appropriate  way.  This  matter  is 
addressed  in  Barnden  (1998)  in  the  case  of  rea¬ 
soning-simulation  environments,  but  the  same 
process also applies when pretence cocoons are
thrown in (except for the added specificity provision
in the previous paragraph, and a reflection
of it into outer layers).

FINAL REMARKS

ATT-Meta’s metaphorical pretence processing
appears to provide a partial implementation
of the “conceptual blending” notion of
Turner & Fauconnier (1995). Metaphorical pretence
cocoons can contain a mixture of pretence-based
and non-pretence-based reasoning,
because of the fact importation mentioned in a
previous section, and because non-source-domain
rules can be used within a cocoon.

Some psychological research suggests that
people may not construct source-based meanings
for metaphorical utterances, at least if the
utterances  are  in  an  appropriate  context  and  are 
of  a  familiar  nature.  Although  ATT-Meta  is  not 
meant  to  be  a  psychological  model,  it  is  worth 
noting  that  its  use  of  source-based  meanings 
does  not  conflict  with  the  psychological  re¬ 
search.  This  is  explained  in  Barnden  (to  appear). 

A  near-future  topic  for  research  on  ATT- 
Meta  is  mixed  metaphor.  We  have  seen  the 
mixing  of  MIND  AS  PHYSICAL  SPACE  and 
IDEAS AS PHYSICAL OBJECTS in this paper,
but more conflictful mixing is of interest.
Also, this mixing is “parallel” in that a
single target is directly illuminated by two
metaphors. “Serial” mixing (i.e. chaining),
when  A  is  viewed  as  B  and  B  is  viewed  as  C, 
can  be  handled  in  ATT-Meta  by  nesting  a 
pretence cocoon for B-as-C inside one for A-as-B.
Not much experimentation has yet been done
on this with ATT-Meta.

ACKNOWLEDGMENT

The  research  was  supported  in  part  by  grant 
number IRI-9101354 from the National Science
Foundation  (USA). 

REFERENCES 

Barnden, J. A. (1998). Uncertain reasoning
about agents’ beliefs and reasoning.
Technical Report CSRP-98-11, School




of Computer Science, The University of
Birmingham,  U.  K.  Invited  submission 
to  Artificial  Intelligence  and  Law. 

Barnden,  J.  A.  (in  press).  An  AI  system  for 
metaphorical  reasoning  about  mental 
states  in  discourse.  In  Koenig,  J-P, 
(Ed.),  Discourse  and  Cognition: 
Bridging  the  Gap.  Cambridge  Univer¬ 
sity  Press. 

Barnden,  J.  A.  (to  appear).  Combining  uncer¬ 
tain  belief  reasoning  and  uncertain  met¬ 
aphor-based  reasoning.  To  appear  in 
Procs.  Twentieth  Annual  Meeting  of  the 
Cognitive  Science  Society,  University  of 
Wisconsin-Madison, August 1-4, 1998.

Barnden,  J.  A.,  Helmreich,  S.,  Iverson,  E.  & 
Stein,  G.  C.  (1994).  An  integrated  im¬ 
plementation  of  simulative,  uncertain 
and  metaphorical  reasoning  about  men¬ 
tal  states.  In  J.  Doyle,  E.  Sandewall  &  P. 
Torasso  (Eds),  Principles  of  Knowledge 
Representation  and  Reasoning:  Pro¬ 
ceedings  of  the  Fourth  International 
Conference.  Morgan  Kaufmann. 

Black,  M.  (1979).  More  about  metaphor.  In 
A.  Ortony  (Ed.),  Metaphor  and 
Thought,  pp.  19-43.  Cambridge  Univer¬ 
sity  Press. 


Davidson,  D.  (1979).  What  metaphors  mean. 
In  S.  Sacks  (Ed.),  On  Metaphor.  Univer¬ 
sity  of  Chicago  Press. 

Grady,  J.  E.  (1997).  THEORIES  ARE  BUILD¬ 
INGS  revisited.  Cognitive  Linguistics, 
8(4),  pp.267-290. 

Hobbs,  J.  R.  (1990).  Literature  and  cognition. 
CSLI Lecture Notes, No. 21, Center for
the  Study  of  Language  and  Information, 
Stanford  University. 

Lakoff,  G.  (1993).  The  contemporary  theory  of 
metaphor.  In  A.  Ortony  (Ed.),  Metaphor 
and  Thought,  2nd  edition.  Cambridge 
University  Press. 

Lakoff, G. & Turner, M. (1989). More than cool
reason: a field guide to poetic metaphor.
U.  Chicago  Press. 

Martin, J. H. (1990). A computational model of
metaphor  interpretation.  Academic  Press. 

Turner, M. & Fauconnier, G. (1995). Conceptual
integration and formal expression.
Metaphor and Symbolic Activity, 10(3),
pp.183-204.

Veale, T. & Keane, M. T. (1997). The competence
of sub-optimal structure mapping
on ‘hard’ analogies. In Procs. Int. Joint
Conf. On Artificial Intelligence (Nagoya,
Japan), August 1997.


ALIGNMENT  AND  ABSTRACTION  IN  METAPHOR 

Brian  F.  Bowdle 

Department  of  Psychology 
Indiana  University 
Bloomington,  IN  47405 
bbowdle@indiana.edu 


INTRODUCTION 

Metaphors  establish  mappings  between 
concepts  from  disparate  domains  of  knowledge. 
For example, in the metaphor The mind is a
computer,  an  abstract  entity  is  described  in 
terms  of  a  complex  electronic  device.  It  is  wide¬ 
ly  believed  that  metaphors  are  a  major  source 
of  knowledge  change,  and  a  great  deal  of  re¬ 
search  has  examined  how  metaphors  can  en¬ 
rich  and  illuminate  concepts  that  would  other¬ 
wise  remain  vague  or  ambiguous.  However, 
there  have  been  far  fewer  attempts  to  explain  a 
second  generative  function  of  metaphors  - 
namely,  lexical  extension.  In  this  paper,  I  will 
discuss  (1)  how  metaphoric  mappings  create 
new  word  meanings,  and  (2)  how  these  new 
meanings  are  applied  in  subsequent  metaphor 
processing.  Before  turning  to  these  issues,  how¬ 
ever,  it  is  necessary  to  consider  the  nature  of 
metaphoric  mappings  in  greater  depth. 

METAPHOR  AND  ANALOGY 

Metaphors  are  traditionally  viewed  as 
comparisons  between  the  target  (a-term)  and 
the  base  (b-term).  According  to  several  recent 
versions  of  this  view,  metaphors  act  to  set  up 
correspondences between isomorphic conceptual
structures (e.g., Carbonell, 1981; Gentner,
1983; Gentner, Falkenhainer, & Skorstad,
1988; Indurkhya, 1987; Verbrugge & McCarrell,
1977). In other words, metaphor can be
seen  as  a  species  of  analogy. 

Gentner’s (1983) structure-mapping theory
is among the most clearly articulated and
extensively studied of these approaches to
metaphor comprehension. Structure-mapping
theory assumes that the act of comparison involves
two stages: alignment and projection.
The alignment process operates to create a
maximal structurally consistent match between
two representations that observes one-to-one
mapping and parallel connectivity
(Falkenhainer, Forbus, & Gentner, 1989).
That is, each element of one representation
can be placed in correspondence with at most
one element of the other representation, and
arguments of aligned relations are themselves
aligned. A final constraint on the alignment
process is systematicity: Alignments that
form deeply interconnected structures, in
which higher-order relations constrain lower-order
relations, are preferred over less systematic
sets of commonalities. Once a structurally
consistent match between the target
and  base  domains  has  been  found,  further 
predicates  from  the  base  that  are  connected 
to  the  common  system  can  be  projected  to 
the  target  as  candidate  inferences. 

To  illustrate  these  processes,  consider  the 
metaphor Men are wolves. Given the simple
target  and  base  representations  shown  in  Fig¬ 
ure  1,  structure-mapping  theory  predicts  the 
following  sequence  of  events  in  interpreting 
the  metaphor.  First,  the  relation  prey  on,  which 
is  shared  by  the  target  and  base,  is  aligned. 
Next,  the  arguments  of  the  relation  are  aligned 
by parallel connectivity: wolves → men and animals
→ women. Finally, predicates that are
unique to the base but connected to the aligned
structure (i.e., those predicates specifying that
the predatory behavior is instinctive) are carried
over to the target. Thus, the metaphor
would  be  interpreted  as  meaning  something 
like,  “Men  instinctively  prey  on  women.” 
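
The alignment and projection steps for this example can be sketched in code. The sketch is not the structure-mapping engine of Falkenhainer, Forbus, and Gentner (1989): it matches relations greedily by name, enforces one-to-one mapping, and does not score systematicity. The tuple encoding of the representations (in particular the predicate name "instinctive") and the function names are assumptions, since the contents of Figure 1 are only described in the text.

```python
# Relations are tuples: (predicate, argument, ...). An argument may itself
# be a relation, which is how a higher-order predicate is encoded here.
PREY_ON_BASE = ("prey_on", "wolves", "animals")
base = [PREY_ON_BASE, ("instinctive", PREY_ON_BASE)]
target = [("prey_on", "men", "women")]


def substitute(expr, mapping):
    """Rewrite a base expression in target terms using the object mapping."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(arg, mapping) for arg in expr[1:])
    return mapping.get(expr, expr)


def align(base, target):
    """Align identically named relations, then their arguments
    (parallel connectivity), keeping the object mapping one-to-one."""
    mapping = {}
    for b in base:
        for t in target:
            if b[0] != t[0] or len(b) != len(t):
                continue
            pairs = list(zip(b[1:], t[1:]))
            used = set(mapping.values())
            consistent = all(
                mapping[x] == y if x in mapping else y not in used
                for x, y in pairs
            )
            if consistent:
                mapping.update(pairs)
    return mapping


def project(base, target, mapping):
    """Carry over base predicates that are connected to the aligned
    structure but have no counterpart in the target."""
    inferences = []
    for b in base:
        image = substitute(b, mapping)
        connected = any(
            isinstance(arg, tuple) and substitute(arg, mapping) in target
            for arg in b[1:]
        )
        if connected and image not in target:
            inferences.append(image)
    return inferences
```

Under this encoding, align(base, target) pairs wolves with men and animals with women, and project then yields the single candidate inference that the preying of men on women is instinctive.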

In  many  metaphors  (as  in  analogies),  the 
focus  is  on  relational  commonalities,  and  cor¬ 
responding  objects  in  the  target  and  base  need 
not  be  similar.  Thus,  in  the  above  example,  the 
alignment  of  the  target  men  and  the  base  wolves 
was  determined  primarily  by  the  matching  re¬ 
lation  prey  on.  However,  the  way  in  which  men 
prey  on  women  is  different  from  the  way  in 
which  wolves  prey  on  animals.  This  situation, 
in  which  matching  predicates  contain  domain- 
specific  differences,  is  typical  of  metaphors 
(e.g.,  Ortony,  1979;  Tourangeau  &  Sternberg, 
1981). Metaphoric mappings may therefore require
rerepresentation in one or both terms. In
particular,  domain-specific  features  of  match¬ 
ing  predicates  may  be  omitted  so  that  the  com¬ 
mon  structure  is  made  more  obvious  (see  Clem¬ 
ent,  Mawby,  &  Giles,  1994,  for  a  review  of  this 
and  other  modes  of  rerepresentation). 

METAPHOR  AND  POLYSEMY 


Like  analogies,  metaphors  lend  additional 
structure  to  problematic  target  concepts,  there¬ 
by  making  these  concepts  more  coherent.  How¬ 
ever,  this  is  not  the  only  way  in  which  meta¬ 
phors  can  lead  to  knowledge  change.  Metaphors 
are  also  a  primary  source  of  polysemy  -  they 
allow  words  with  specific  meanings  to  take  on 
additional, related meanings (e.g., Lakoff, 1987;
Lehrer,  1990;  Miller,  1979;  Nunberg,  1979; 
Sweetser,  1990).  For  example,  consider  the 
word  roadblock.  There  was  presumably  a  time 
when  this  word  referred  only  to  a  barricade  set 
up  in  a  road.  With  repeated  metaphoric  use, 
however,  roadblock  has  acquired  the  second¬ 
ary  sense  “anything  that  blocks  progress”  (as 
in  Fear  is  a  roadblock  to  success). 

How  do  metaphors  create  new  word  mean¬ 
ings?  One  recent  and  influential  proposal  is  that 
such  lexical  extensions  are  due  to  stable  pro¬ 
jections  of  conceptual  structures  and  corre¬ 
sponding  vocabulary  items  from  one  (typically 
concrete)  domain  of  experience  to  another  (typ¬ 
ically  abstract)  domain  of  experience  (e.g.,  La¬ 
koff,  1987;  Lehrer,  1990;  Sweetser,  1990).  On 
this view, the metaphoric meaning of a polysemous
word is understood directly in terms of
its literal meaning.

Figure 1. A structure-mapping interpretation of the metaphor Men are wolves.

I wish to consider an alternative account
of the relationship between metaphor and polysemy
- one that is based on the analogical
approach to metaphor comprehension. The central
idea is that structural alignment allows for
the creation of abstract metaphoric categories,
which may in turn be lexicalized as secondary
senses of metaphor base terms (Bowdle, 1998;
Bowdle & Gentner, 1995, in preparation; Gentner
& Wolff, 1997).

The  Induction  of  Metaphoric  Categories 

When  a  novel  metaphor  is  first  encountered, 
both  the  target  and  base  terms  refer  to  domain- 
specific  concepts,  and  the  metaphor  is  interpret¬ 
ed  by  (1)  aligning  the  two  representations,  and 
(2)  importing  predicates  from  the  base  to  the  tar¬ 
get,  which  then  count  as  further  matches.  As  a 
result  of  this  comparison  process,  the  common 
relational  structure  will  increase  in  salience  rel¬ 
ative  to  domain-specific  differences  between  the 
two  representations.  This  highlighted  system 
may  in  turn  give  rise  to  an  abstract  metaphoric 
category  of  which  the  target  and  base  can  be  seen 
as  instances.  This  is  akin  to  the  induction  of  do¬ 
main-general  problem  schemas  during  the  course 
of analogical problem solving (e.g., Gick & Holyoak,
1983; Novick & Holyoak, 1991; Ross &
Kennedy, 1990).
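
The two interpretation steps described above, alignment followed by the projection of base predicates onto the target, can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the structure-mapping engine itself: the Predicate class, the three helper functions, and the "instinctive" predicate attributed to wolves are hypothetical names introduced for illustration, and matching is reduced to identical relation names with equal arity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    relation: str   # relation name, e.g. "prey_on"
    args: tuple     # objects the relation holds over, e.g. ("wolves", "animals")


def align(target, base):
    """Step 1: pair target and base predicates that share a relation name and arity."""
    return [(t, b) for t in target for b in base
            if t.relation == b.relation and len(t.args) == len(b.args)]


def object_correspondences(matches):
    """Read base-object -> target-object correspondences off the matched predicates."""
    mapping = {}
    for t, b in matches:
        for b_arg, t_arg in zip(b.args, t.args):
            mapping.setdefault(b_arg, t_arg)
    return mapping


def candidate_inferences(target, base, mapping):
    """Step 2: project base predicates whose arguments are all mapped
    and that the target does not already contain."""
    projected = []
    for b in base:
        if all(arg in mapping for arg in b.args):
            p = Predicate(b.relation, tuple(mapping[arg] for arg in b.args))
            if p not in target:
                projected.append(p)
    return projected


# Hypothetical encodings of the "Men are wolves" example (assumed, not from the source).
target = [Predicate("prey_on", ("men", "women"))]
base = [Predicate("prey_on", ("wolves", "animals")),
        Predicate("instinctive", ("wolves",))]

matches = align(target, base)
mapping = object_correspondences(matches)
inferences = candidate_inferences(target, base, mapping)
print(mapping)
print(inferences)
```

In this sketch the shared relation drives the object correspondences (wolves with men, animals with women), and the only projected predicate is the one the target lacked. Rerepresentation of domain-specific differences is deliberately left out.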

On  this  view,  metaphoric  categories  are 
created  as  a  byproduct  of  the  comparison  pro¬ 
cess,  and  may  be  stored  separately  from  the 
original  target  and  base  concepts.  However,  if 
a  given  metaphor  base  is  repeatedly  aligned  with 
different  targets  so  as  to  yield  the  same  basic 
interpretation,  the  abstraction  will  become  con¬ 
ventionally  associated  with  the  base  term.  At 
this  point,  the  base  term  will  be  polysemous, 
having both a domain-specific meaning and a
related  domain-general  meaning. 

Of  course,  not  just  any  metaphor  can  lead 
to  lexical  extension.  Rather,  the  alignment  of 
the  target  and  base  concepts  must  be  able  to 
suggest a coherent category. Mappings that focus
on relational structures are therefore more
likely to generate stable abstractions than mappings
that focus on less systematic object descriptions
(see also Ramscar & Pain, 1996;
Shen, 1992). For example, the metaphor The
sun is a tangerine elicits two common attributes
of the target and base: Both are round, and both
are  orange  in  color.  However,  these  two  at¬ 
tributes  are  not  systematically  related.  The 
metaphor  is  therefore  unlikely  to  suggest  a  cat¬ 
egory  of  things  that  are  round  and  orange  in 
color,  and  it  will  not  lead  to  lexical  extension 
of  the  base  term  tangerine. 

The  Career  of  Metaphor 

One  of  the  key  issues  in  metaphor  research 
concerns  how  best  to  characterize  differences 
between  novel,  conventional,  and  dead  meta¬ 
phors.  The  present  account  of  the  relationship 
between  metaphor  and  polysemy  suggests  a 
representational  distinction  between  these  types 
of metaphors. The basic idea is that the conventionality
of a metaphor is determined by (1)
whether  or  not  the  base  term  evokes  a  meta¬ 
phoric  category,  and  (2)  how  this  abstraction  is 
related  to  the  literal  base  concept. 

The evolution from novel to dead metaphors
is summarized in Figure 2. Novel metaphors
involve base terms that refer to a domain-specific
concept, but are not (yet) associated
with  a  domain-general  category.  For  example, 
the  novel  base  term  glacier  (as  in  Science  is  a 
glacier)  has  a  literal  sense  -  “a  large  body  of 
ice  spreading  outward  over  a  land  surface”  - 
but no related metaphoric sense (e.g., “anything
that  progresses  slowly  but  steadily”). 

In  contrast,  conventional  metaphors  in¬ 
volve  base  terms  that  refer  both  to  a  literal 
concept and to an associated metaphoric category.
For example, the conventional base
term blueprint (as in A gene is a blueprint)
has two closely related senses: “a blue and
white photographic print showing an architect’s
plan” and “anything that provides a
plan.” Conventional base terms are polysemous,
and the literal and metaphoric meanings
are semantically linked due to their obvious
similarity.

The ultimate conclusion of the career of
metaphor  occurs  when  the  relationship  between 
the  derived  metaphoric  category  and  the  origi¬ 
nal  base  concept  is  no  longer  recognized.  At 
this  stage,  any  expression  using  the  metaphoric 
sense  of  the  base  term  is  a  dead  metaphor,  and 
will  not  seem  metaphoric.  Figure  2  shows  two 
possible types of dead metaphors. Dead₁ metaphors
are similar to conventional metaphors,
except that the two representations evoked by
the base term are no longer semantically linked.
That is, dead₁ base terms are homonymous rather
than polysemous. For example, consider the
statement A university is a culture of knowledge.
Here, the word culture refers to a particular
heritage or society, and its use seems quite
literal.  In  fact,  this  sense  of  culture  is  a  meta¬ 
phoric  extension  of  another  commonly-known 
sense  of  the  word:  “a  preparation  for  growth” 
(as  in  the  culture  of  the  vine  or  bacteria  cul¬ 
ture).  However,  these  two  meanings  no  longer 
seem  related.  This  is  perhaps  because  the  once- 
abstract  metaphoric  category  has,  through  re¬ 
peated  application  to  the  domain  of  human  af¬ 
fairs,  acquired  new  domain-specific  features. 

Finally, dead₂ metaphors involve base
terms that refer only to a derived metaphoric
category - the original base concept no longer
exists. An example of this is the dead₂ base term
blockbuster (as in The movie “Titanic” was a
blockbuster), which means “anything that is
highly  effective  or  successful.”  However,  most 
people  are  unaware  that  this  word  originally 
referred  to  a  very  large  bomb  that  could  demol¬ 
ish  an  entire  city  block. 

PROCESSING  IMPLICATIONS 

Thus  far,  I  have  discussed  how  abstract  met¬ 
aphoric  categories  are  created,  and  how  these 
categories  may  be  lexicalized  as  secondary  sens¬ 
es  of  metaphor  base  terms.  Not  only  does  this 
account  offer  a  means  of  distinguishing  between 
novel,  conventional,  and  dead  metaphors,  but  it 
also  has  clear  implications  for  the  effects  of  con¬ 
ventionality  on  metaphor  processing. 

Consider  again  the  career  of  metaphor  sum¬ 
marized  in  Figure  2.  In  novel  metaphors,  both 
the  target  and  base  terms  refer  to  domain-spe¬ 
cific  concepts  at  roughly  the  same  level  of  ab¬ 
straction.  Novel  metaphors  will  therefore  be  in¬ 
terpreted  as  comparisons,  in  which  the  target  is 
structurally  aligned  with  the  base.  In  convention¬ 
al  metaphors,  however,  the  base  term  is  polyse- 
mous  -  it  refers  both  to  a  domain-specific  con¬ 
cept  and  to  a  related  domain-general  category. 
Conventional  metaphors  may  therefore  be  inter¬ 
preted  either  as  comparisons,  by  aligning  the  tar¬ 
get  concept  with  the  literal  base  concept,  or  as 
categorizations,  by  aligning  the  target  concept 
with  the  metaphoric  category  named  by  the  base 
term. Finally, in dead metaphors, only the metaphoric
category named by the base will be applied
to the target - the original base concept
either seems irrelevant (dead₁ metaphors) or is
no longer available (dead₂ metaphors).
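
The stages of the career of metaphor and the processing modes each stage allows can be summarized in a short Python sketch. This is only an illustration of the distinctions drawn above: the BaseTerm fields, the Stage names, and the two functions are hypothetical, and the four example entries simply encode the glacier, blueprint, culture, and blockbuster cases from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    NOVEL = "novel"
    CONVENTIONAL = "conventional"
    DEAD_1 = "dead, senses no longer linked"
    DEAD_2 = "dead, literal sense lost"


@dataclass(frozen=True)
class BaseTerm:
    word: str
    has_literal_sense: bool      # domain-specific concept still available
    has_metaphoric_sense: bool   # abstract metaphoric category is lexicalized
    senses_linked: bool          # relation between the two senses still recognized


def stage(term):
    """Classify a base term by which senses it evokes and whether they are linked."""
    if not term.has_literal_sense and not term.has_metaphoric_sense:
        raise ValueError("a base term must evoke at least one sense")
    if not term.has_metaphoric_sense:
        return Stage.NOVEL
    if not term.has_literal_sense:
        return Stage.DEAD_2
    return Stage.CONVENTIONAL if term.senses_linked else Stage.DEAD_1


def processing_modes(s):
    """Interpretation modes the account makes available at each stage."""
    if s is Stage.NOVEL:
        return ("comparison",)
    if s is Stage.CONVENTIONAL:
        return ("comparison", "categorization")
    return ("categorization",)


EXAMPLES = [
    BaseTerm("glacier", True, False, False),
    BaseTerm("blueprint", True, True, True),
    BaseTerm("culture", True, True, False),
    BaseTerm("blockbuster", False, True, False),
]

for term in EXAMPLES:
    s = stage(term)
    print(term.word, s.name, processing_modes(s))
```

The design choice worth noting is that conventional terms are the only ones with two available modes, which is what the predictions about grammatical form and comprehension time rest on.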

Thus,  as  metaphors  become  increasingly 
conventional,  there  is  a  shift  in  mode  of  pro¬ 
cessing  from  comparison  to  categorization 
(Bowdle,  1998;  Bowdle  &  Gentner,  1995,  in 
preparation;  Gentner  &  Wolff,  1997).  This  is 
consistent  with  a  number  of  recent  proposals, 
according  to  which  the  interpretation  of  novel 
metaphors  involves  sense  creation,  but  the  in¬ 
terpretation  of  conventional  metaphors  involves 
sense retrieval (e.g., Blank, 1988; Blasko &
Connine, 1993; Giora, 1997; Turner & Katz,
1997). On the present view, the senses retrieved
during conventional metaphor comprehension
are abstract metaphoric categories.¹

Experimental  Evidence 

To  gain  direct  evidence  for  the  processing 
shift predicted by the career of metaphor, Dedre
Gentner and I have recently conducted a
series  of  experiments  comparing  the  compre¬ 
hension  and  evaluation  of  novel  and  conven¬ 
tional  figurative  statements  (Bowdle  &  Gent¬ 
ner,  1995,  in  preparation).  Central  to  the  logic 
of  these  experiments  was  the  distinction  be¬ 
tween  metaphors  and  similes. 

Nominal  metaphors  (figurative  statements 
of  the  form  X  is  Y)  can  often  be  paraphrased  as 
similes  (figurative  statements  of  the  form  X  is 
like  Y).  For  example,  one  can  say  both  The  mind 
is  a  computer  and  The  mind  is  like  a  computer. 

This  linguistic  alternation  is  interesting  because 
metaphors  are  grammatically  identical  to  liter¬ 
al  categorization  statements  (e.g.,  A  sparrow  is 
a bird), and similes are grammatically identical
to literal comparison statements (e.g., A
sparrow is like a robin). Assuming that form
typically follows function in both literal and
figurative language, metaphors and similes may
tend to promote different comprehension strategies.
Specifically, metaphors should invite
classifying the target as a member of a category
named by the base, whereas similes should
invite  comparing  the  target  to  the  base.  This 
makes  the  metaphor-simile  distinction  a  valu¬ 
able  tool  for  examining  the  use  of  comparison 
and  categorization  during  figurative  language 
comprehension. 

Grammatical  Form  Preferences.  If  con¬ 
ventionalization  results  in  a  processing  shift  from 
comparison  to  categorization,  then  there  should 
be  a  corresponding  shift  at  the  linguistic  level 
from  the  comparison  (simile)  form  to  the  cate¬ 
gorization  (metaphor)  form.  We  gave  subjects 
novel  and  conventional  figurative  statements  in 
both  grammatical  forms,  and  asked  which  form 
they  preferred  for  each  statement.  Subjects  were 
also  given  statements  in  which  the  target  was 
literally similar to the base (e.g., lemon → orange)
- for which the comparison form is most natural
- and statements in which the target was a member
of a literal category named by the base (e.g.,
whale → mammal) - for which the categorization
form  is  most  natural. 

As  predicted,  subjects  preferred  similes 
more  strongly  for  novel  than  for  conventional 
figurative  statements.  Indeed,  the  preference  for 
the  comparison  form  was  as  great  for  novel  fig- 
uratives  as  for  statements  in  which  the  target 
and  base  were  literally  similar.  However,  sub¬ 
jects  showed  no  strong  preference  for  express¬ 
ing  conventional  figurative  statements  as  simi¬ 
les  or  as  metaphors.  This  is  consistent  with  the 
claim that, because conventional base terms refer
both to a literal concept and to a related
metaphoric  category,  conventional  figuratives 
may  be  interpreted  either  as  comparisons  or  as 
categorizations. 

Comprehension  Times.  The  career  of  met¬ 
aphor  also  makes  clear  predictions  about  the  on¬ 
line  comprehension  of  novel  and  conventional 
figurative  statements.  One  prediction  is  that,  if 
conventionalization  results  in  a  processing  shift 
from  comparison  to  categorization,  then  conven¬ 
tional  figuratives  should  be  easier  to  interpret 
than  novel  figuratives.  Because  metaphoric  cat¬ 
egories  will  be  informationally  sparser  than  the 
literal  concepts  they  were  derived  from,  map¬ 
pings  between  a  target  and  a  metaphoric  catego¬ 
ry  will  be  computationally  less  costly  than  map¬ 
pings  between  a  target  and  a  literal  base  con¬ 
cept.  In  fact,  previous  studies  have  confirmed 
that  conventional  metaphors  are  comprehended 
more rapidly than novel metaphors (e.g., Blank,
1988;  Blasko  &  Connine,  1993). 

A  second  and  more  interesting  prediction 
concerns  the  effects  of  conventionality  on  the 
relative  comprehension  times  of  metaphors  and 
similes.  If  novel  figurative  statements  are  inter¬ 
preted  strictly  as  comparisons,  then  novel  simi¬ 
les  should  be  easier  to  comprehend  than  novel 
metaphors.  This  is  because  only  the  simile  form 
directly  invites  comparison.  At  the  same  time,  if 
conventional  figurative  statements  can  be  inter¬ 
preted  either  as  comparisons  or  as  categoriza¬ 
tions,  then  conventional  metaphors  should  be 
easier to comprehend than conventional similes.
The metaphor form invites categorization,
and will therefore promote a relatively simple
alignment  between  the  target  and  the  abstract 
metaphoric  category  named  by  the  base.  The 
simile  form  invites  comparison,  and  will  there¬ 
fore  promote  a  more  complex  alignment  between 
the  target  and  the  literal  base  concept. 

We  collected  subjects’  comprehension 
times  for  novel  and  conventional  figurative 
statements  phrased  either  as  metaphors  or  as 
similes.  The  results  were  as  predicted  by  the 
career  of  metaphor.  First,  conventional  figura¬ 
tive  statements  were  interpreted  faster  than 
novel  figurative  statements.  Second,  there  was 
an  interaction  between  conventionality  and 
grammatical  form:  Novel  similes  were  faster 
than  novel  metaphors,  but  conventional  meta¬ 
phors  were  faster  than  conventional  similes. 

Metaphoricity  Ratings.  What  makes  some 
mappings  seem  metaphoric  and  other  mappings 
seem  literal?  One  possibility  is  that  metaphoric¬ 
ity  is  due  to  rerepresentation,  in  which  distinct 
domain-specific  features  of  matching  predicates 
are  omitted  so  that  the  common  structure  is  made 
more  obvious  (Bowdle,  1998).  This  is  consis¬ 
tent  with  the  observation  that  metaphors  and  sim¬ 
iles  typically  involve  mappings  between  con¬ 
cepts  from  different  ontological  domains,  where¬ 
as  literal  comparisons  and  literal  categorizations 
typically  involve  mappings  between  concepts 
from  the  same  ontological  domain. 

This  view  of  the  relationship  between  re¬ 
representation  and  metaphoricity  suggests  a 
further  test  of  the  career  of  metaphor.  If  con¬ 
ventionalization  results  in  a  processing  shift 
from  comparison  to  categorization,  then  novel 
figurative  statements  should  seem  more  meta¬ 
phoric  than  conventional  figurative  statements. 
Because  the  predicates  of  literal  base  concepts 
will  be  more  domain-specific  than  those  of  ab¬ 
stract  metaphoric  categories,  they  will  require 
more rerepresentation when matched with domain-specific
predicates in a target concept.

A  further  prediction  concerns  how  conven¬ 
tionality  affects  the  relative  metaphoricity  of 
metaphors  and  similes.  If  both  novel  metaphors 
and  novel  similes  are  interpreted  by  aligning 
the  target  with  the  same  literal  base  concept, 
then  both  grammatical  forms  should  seem 
equally metaphoric. At the same time, if conventional
metaphors promote aligning the target
with an abstract metaphoric category, but
conventional  similes  promote  aligning  the  tar¬ 
get  with  a  literal  base  concept,  then  conven¬ 
tional  similes  should  seem  more  metaphoric 
than  conventional  metaphors  -  the  simile  will 
initiate  a  mapping  that  requires  a  greater  de¬ 
gree  of  rerepresentation.  Note  that  this  predic¬ 
tion  is  contrary  to  the  traditional  (and  previ¬ 
ously  untested)  assumption  that  metaphors  are 
more  metaphoric  than  similes. 

We  gave  subjects  novel  and  conventional 
figurative  statements  phrased  either  as  meta¬ 
phors  or  as  similes,  and  asked  them  to  rate  the 
metaphoricity  of  each  statement.  The  results 
were  as  predicted  by  the  career  of  metaphor. 
First,  novel  figurative  statements  were  rated  as 
more  metaphoric  than  conventional  figurative 
statements.  Second,  there  was  an  interaction 
between  conventionality  and  grammatical  form: 
Novel  metaphors  and  similes  were  equally 
metaphoric,  but  conventional  similes  were  more 
metaphoric  than  conventional  metaphors. 

CATEGORIZATION  MODELS  OF 
METAPHOR 

One of the central claims made in this paper
is that as metaphors are conventionalized -
that  is,  as  they  increasingly  rely  on  the  applica¬ 
tion  of  stable  abstractions  -  there  is  a  shift  in 
mode  of  processing  from  comparison  to  cate¬ 
gorization.  However,  several  theorists  have  re¬ 
cently  argued  that  all  metaphors  are  essentially 
categorizations (e.g., Glucksberg & Keysar,
1990;  Glucksberg,  McGlone,  &  Manfredi, 
1997;  Honeck,  Kibler,  &  Firment,  1987; 
Kennedy,  1990).  On  this  view,  the  original  tar¬ 
get  and  base  concepts  of  a  novel  metaphor  are 
never  directly  aligned.  Rather,  the  metaphor  is 
interpreted by (1) deriving an abstract metaphoric
category from the base concept alone,
and  (2)  applying  this  category  to  the  target. 

The  experimental  evidence  summarized 
above  casts  doubt  on  these  processing  claims. 
Novel metaphors appear to be interpreted strictly
as comparisons, in which the target is structurally
aligned with the base. Although novel
metaphoric  mappings  may  create  abstract  met¬ 
aphoric  categories,  these  categories  initially 
arise  as  a  byproduct  of  the  comparison  process. 
Only  when  a  metaphoric  category  has  become 
conventionally associated with the base term
of  a  metaphor  can  the  statement  be  interpreted 
as  a  categorization.  Assuming  that  the  meta¬ 
phor  is  not  dead,  however,  it  may  still  be  inter¬ 
preted  as  a  comparison  between  the  target  and 
the  original  base  concept. 

CONCLUSIONS 

By  viewing  metaphors  as  analogies,  two 
generative  functions  of  metaphors  can  be  ex¬ 
plained  -  namely,  the  structural  enhancement 
of  target  concepts,  and  the  lexical  extension  of 
base  terms.  In  this  paper,  I  have  focused  on  the 
latter  of  these  two  functions,  and  have  discussed 
the  relationship  between  polysemy  and  conven¬ 
tionality  in  metaphors.  The  career  of  metaphor 
outlined  here  offers  a  unified  approach  to  met¬ 
aphor  processing. 

ABSTRACT

These  experiments  evaluated  the  claim  that 
abstract  conceptual  domains  are  organized  and 
structured  on-line  as  metaphorical  mappings 
from  conceptual  domains  grounded  directly  in 
experience.  One  hypothesis  is  that  the  concep¬ 
tual  domain  of  time  is  systematically  organized 
in  terms  of  the  more  concrete  and  familiar  do¬ 
main  of  space.  I  focus  on  relational  similarities 
between  the  conceptual  domains  of  space  and 
time,  consider  a  number  of  explanations  of  how 
these  similarities  may  have  come  about,  and 
describe  a  set  of  experiments  designed  to  dis¬ 
tinguish  between  these  explanations.  The  results 
indicated  that  people  indeed  use  spatial  sche¬ 
mas  on-line  to  understand  and  organize  the  con¬ 
ceptual  domain  of  time.  These  results  provide 
some  of  the  first  empirical  evidence  for  meta¬ 
phoric  representation. 

INTRODUCTION 

One of the burdens of providing a good theory
of mental representation is to explain how
a  representational  store  as  heterogeneous,  so¬ 
phisticated,  and  abstract  as  the  human  concep- 
ticon  could  possibly  emerge  from  physical  ex¬ 
perience  with  the  world.  One  solution  proposed 
by  Lakoff  and  Johnson  (1980)  argues  that  our 
conceptual  system  is  structured  around  a  small 
set  of  experiential  concepts  (concepts  that 
emerge  directly  out  of  experience  and  are  de¬ 
fined  in  their  own  terms).  These  fundamental 
experiential concepts include a set of basic spatial
relations (e.g. up/down, front/back), a set
of physical ontological concepts (e.g. entity,
container), and a set of basic experiences or
actions (e.g. eating, moving). According to Lakoff,
all other concepts that do not emerge directly
out of physical experience must be metaphoric
in nature. Lakoff and colleagues further
propose that these metaphoric, or abstract
concepts  are  understood  and  structured  through 
on-line  metaphorical  mappings  from  a  small  set 
of  fundamental  experiential  concepts.  This  pa¬ 
per  aims  to  test  the  psychological  validity  of 
the  metaphoric  theory  of  mental  representation. 

Lakoff  and  his  colleagues  have  noted  the 
presence of many large-scale systems of conventional
conceptual metaphors: cases in which
language from one domain is used in other domains
(Lakoff & Johnson, 1980; Lakoff &
Kövecses, 1987). These conventional metaphors
can often be characterized as belonging
to a particular source-to-target mapping: e.g.,
MIND  IS  A  CONTAINER,  IDEAS  ARE 
FOOD.  In  keeping  with  the  IDEAS  ARE  FOOD 
schema, for example, a reader might be reluctant
to “swallow Lakoff’s claim” because they
haven’t  yet  gotten  to  “the  meaty  part  of  the  pa¬ 
per,”  or  because  they  “just  can’t  wait  to  really 
sink  their  teeth  into  the  theory.” 

Such  linguistic  patterns  suggest  that  many 
conceptual  domains  can  be  described  system¬ 
atically  in  terms  of  more  tangible  and  familiar 
domains  (as  in  the  IDEAS  ARE  FOOD  sche¬ 
ma  described  above).  However,  whether  these 
large-scale schemas are psychologically real
conceptual  systems  or  post-hoc  theoretical  con¬ 
structs  remains  an  open  question. 

In  this  paper,  I  will  highlight  a  set  of  rela¬ 
tional  similarities  between  the  conceptual  do¬ 
mains  of  space  and  time,  consider  several  ex¬ 
planations  of  how  these  similarities  may  have 
come  about,  and  describe  two  experiments  that 
distinguish between these explanations. The
described experiments will directly test the psychological
validity of Lakoff’s claim that abstract
conceptual domains are structured by
metaphorical  mappings  from  more  concrete  ex¬ 
periential  domains.  Let  us  now  focus  on  the 
domain  of  time. 

SPATIAL  METAPHORS  FOR  TIME 

We  often  talk  about  time  in  terms  of  space. 
Whether  we  are  looking  forward  to  a  brighter 
tomorrow,  proposing  theories  ahead  of  our 
time, or falling behind schedule, we are relying
on  terms  from  the  domain  of  space  to  talk  about 
time.  There  is  an  orderly  and  systematic  corre¬ 
spondence  between  the  domains  of  time  and 
space  in  language  (Bennett,  1975;  Clark,  1973; 
Lehrer,  1990;  Traugott,  1978). 

The  correspondences  between  space  and 
time  in  language  might  give  us  insight  into  how 
we  mentally  represent  time.  Let  us  focus  on  the 
event-sequencing  aspect  of  conceptual  time,  the 
system  whereby  events  are  temporally  ordered 
with  respect  to  each  other  and  to  the  speaker 
(e.g.  “The  worst  is  behind  us”  or  “Thursday  is 
before  Saturday.”)  In  order  to  capture  the  se¬ 
quential  order  of  events,  time  is  generally  con¬ 
ceived  as  a  one-dimensional,  directional  enti¬ 
ty.  The  spatial  terms  we  import  to  talk  about 
time  are  also  one-dimensional,  directional  terms 
such  as  ahead/behind,  up/down,  as  opposed  to 
multi-dimensional  or  symmetric  terms  such  as 
shallow/deep,  left/right.  This  pattern  is  stable 
across  languages,  and  overall,  spatial  terms  re¬ 
ferring  to  front/back  relations  are  the  ones  most 
widely  borrowed  into  the  domain  of  time  cross- 
linguistically  (Clark,  1973;  Traugott,  1978). 

Like most abstract domains, the domain of
time  can  be  described  through  more  than  one 
metaphor.  In  English,  there  are  two  dominant 
space → time metaphoric systems (Clark,
1973;  Lakoff  &  Johnson,  1980).  The  first  sys¬ 
tem  can  be  termed  the  ego-moving  metaphor, 
where “ego” or the observer’s context progresses
along the time-line toward the future as in
“We are coming up on Christmas” (see Figure
1a). The second system is the time-moving
metaphor. In this metaphor, a time-line is conceived
of as a river or conveyor belt on which
events are moving from the future to the past
as in “Christmas is coming” (see Figure 1b).
These two systems lead to different assignments
of front/back to a time-line (Clark, 1973; Fillmore,
1979; Lakoff & Johnson, 1980; Traugott,
1978).  In  the  ego-moving  system,  front  is  as¬ 
signed  to  the  future  or  later  event  (e.g.  “The 
war  is  behind  us”  or  “His  whole  future  is  be¬ 
fore  him”).  In  the  time-moving  system,  front  is 
assigned  to  a  past  or  earlier  event  (e.g.  “I  will 
see  you  before  4  o’clock”  or  “The  reception 
after  the  talk.”) 

Although  the  apparent  systematicity  and 
coherence  of  the  ego-moving  and  time-mov¬ 
ing  systems  in  temporal  language  is  compel¬ 
ling,  a  priori  it  is  not  clear  that  any  structured 
conceptual  schemas  are  necessary  to  process 
metaphoric  expressions  about  time. 

Figure 1a. Ego-moving schema
Figure 1b. Time-moving schema

It could be the case that “metaphoric” expressions
are simply polysemous expressions.
That  is,  the  “metaphoric”  meaning  is  stored  as 
a secondary meaning in the lexical entry of the
base term. A word like “ahead,” for example,
might have two (or more) word senses associated
with it: ‘in front of spatially’ and ‘in front
of temporally’. If this is the case, one need not
carry  out  any  structured  mapping  between  do¬ 
mains  in  order  to  understand  what  “ahead” 
means  in  a  temporal  context. 

There  is  some  evidence  for  this  alternative 
hypothesis.  For  example,  Glucksberg,  Brown,  and 
McGlone  (1993)  showed  that  people  do  not  ac¬ 
cess  the  “anger  is  heat”  metaphor  when  process¬ 
ing  conventional  idioms  such  as  “lose  one’s  cool.” 

It  is  possible,  then,  that  the  ego-moving  and 
time-moving  metaphors  are  only  language-deep 
— etymological  relics,  not  psychologically  real 
conceptual  schemas.  In  order  to  establish  the 
psychological  reality  of  these  event-sequenc¬ 
ing  schemas,  we  must  first  be  able  to  empiri¬ 
cally  distinguish  between  expressions  that  are 
simply  polysemous,  and  those  that  are  pro¬ 
cessed  as  parts  of  globally  consistent  concep¬ 
tual  schemas. 

EVIDENCE FOR TWO DISTINCT
EVENT-SEQUENCING SCHEMAS

To  investigate  whether  the  ego-moving  and 
time-moving  conceptual  schemas  are  used  in 
real-time  language  comprehension,  Gentner, 
Imai,  and  Boroditsky  (in  preparation)  measured 
processing  time  for  statements  using  event-se¬ 
quencing  expressions  presented  either  consis¬ 
tently  or  inconsistently  with  respect  to  either 
the  ego-moving  or  the  time-moving  schema. 
They  reasoned  that  if  temporal  expressions  were 
processed  as  parts  of  globally  consistent  con¬ 
ceptual  schemas,  then  processing  should  be  flu¬ 
ent  if  the  expressions  are  kept  consistent  to  one 
schema  (processing  time  should  remain  con¬ 
stant).  If  the  schemas  are  switched,  however, 
processing  should  be  disrupted,  and  process¬ 
ing  time  should  increase  as  it  would  take  extra 
time  to  discard  the  old  conceptual  structure  and 
set  up  a  new  one. 


Participants  were  presented  with  a  block 
of  temporal  statements  that  were  either  consis¬ 
tent  to  one  schema,  or  switched  between  the 
ego-moving  and  time-moving  schemas.  For 
each statement (e.g. Christmas is six days before
New Year’s Day), participants were given
a time-line of events (e.g. Past ... New Year’s
Day ... Future), and had to place an event (in
this case Christmas) on the timeline. Response
time  data  showed  that  switching  schemas  did 
indeed  increase  processing  time. 

Another  study  was  conducted  at  Chicago’s 
O’Hare  airport  where  participants  were  passen¬ 
gers  not  aware  of  being  in  a  psychological  study. 
Participants  were  approached  by  the  experi¬ 
menter  and  asked  a  priming  question  in  either 
the  ego-moving  form  (Is  Boston  ahead  or  be¬ 
hind  us  in  time?)  or  the  time-moving  form  (Is 
it  earlier  or  later  in  Boston  than  it  is  here?).  After 
the  participant  answered,  the  experimenter 
asked  the  target  question  (So  should  I  turn  my 
watch  forward  or  back?)  which  was  consistent 
with  the  ego-moving  form.  Response  times  for 
the  target  question  were  collected  with  a  stop¬ 
watch  disguised  as  a  wristwatch.  Once  again, 
response  times  for  consistently  primed  ques¬ 
tions  were  shorter  than  for  inconsistently 
primed  questions.  Switching  schemas  increased 
processing  time.  These  results  suggest  that  there 
are  two  distinct  conceptual  schemas  that  are 
involved  in  sequencing  events  in  time. 

Converging  evidence  comes  from  a  study 
by  McGlone,  Harding,  and  Glucksberg 
(1994).  Participants  answered  blocks  of  ques¬ 
tions  about  days  of  the  week  phrased  in  ei¬ 
ther  the  ego-moving  or  the  time-moving  met¬ 
aphor.  The  ego-moving  blocks  were  com¬ 
posed  of  statements  like  “We  passed  the 
deadline  yesterday.”  The  time-moving  blocks 
were  composed  of  statements  like  “The  dead¬ 
line  was  passed  yesterday.”  For  each  state¬ 
ment  participants  were  asked  to  indicate  the 
day  of  the  week  that  the  event  in  the  state¬ 
ment  had  occurred  or  will  occur.  At  the  end 
of  each  block,  participants  were  presented 
with  an  ambiguous  temporal  statement  such 
as  “Friday’s  game  has  been  moved  forward  a 
day,” and were asked to perform the same task. The above statement
is ambiguous because it could be interpreted using one or the other
schema to yield different answers. McGlone et al. found that
participants in the ego-moving condition tended to disambiguate the
above statement in an ego-moving-consistent manner (thought the
game was on Saturday), and participants in the time-moving
condition tended to disambiguate in a time-moving-consistent manner
(thought the game was on Thursday).

These  studies  provide  evidence  for  the 
existence  of  two  distinct,  globally  consistent 
conceptual  schemas  for  sequencing  events  in 
time.  The  challenge  now  is  to  show  that  these 
two  large-scale  schemas  are  imported  from 
the  experiential  domain  of  space  to  the  ab¬ 
stract  domain  of  time  during  real-time  pro¬ 
cessing,  as  the  metaphorical  representation 
hypothesis  would  imply. 

A  reasonable  alternative  to  the  metaphor¬ 
ical  representation  hypothesis  was  proposed 
by  Murphy  (1996)  who  argued  that  all  do¬ 
mains  are  represented  directly,  not  metaphor¬ 
ically.  According  to  Murphy’s  Structural 
Similarity  hypothesis,  metaphorical  language 
arises  from  pre-existing  structural  similari¬ 
ties  between  two  domains.  The  two  domains 
are  represented  separately,  but  are  quite  sim¬ 
ilar,  and  it  is  this  conceptual  similarity  that 
allows  people  to  construct  understandable 
verbal  metaphors. 

Before  we  can  empirically  distinguish  be¬ 
tween  these  two  hypotheses,  we  need  to  make 
explicit  the  analogy  between  the  schemas  used 
to  order  events  in  time,  and  the  schemas  used 
to  order  objects  in  space.  That  is,  if  some  struc¬ 
tures  in  time  are  metaphors  from  space,  what 
are  the  spatial  schemas  that  serve  as  the  base  of 
this  metaphor? 

STRUCTURAL SIMILARITIES BETWEEN SPACE AND TIME

Many  structural  similarities  exist  between 
the  conceptual  domains  of  space  and  time.  A 
set  of  spatial  analogs  for  the  ego-moving  and 
time-moving  schemas  is  proposed  below. 


The  Ego-moving  schema 

According  to  the  ego-moving  schema, 
events  in  the  domain  of  time  are  ordered  with 
respect  to  the  observer’s  direction  of  motion. 
The  front  of  an  ego-moving  scenario  is  assigned 
as  the  furthest  point  in  the  observer’s  direction 
of  motion.  Since  in  the  domain  of  time  the  ob¬ 
server  is  inevitably  moving  from  the  past  to  the 
future,  front  is  assigned  to  future  or  later  events. 
An  analogous  schema  exists  for  ordering  ob¬ 
jects  in  a  line  (see  Figure  2).  When  an  observer 
moves  along  a  path,  objects  are  ordered  accord¬ 
ing  to  the  direction  of  motion  of  the  observer. 
In  the  example  in  Figure  2,  the  dark  can  is  said 
to  be  in  front  because  it  is  further  along  in  the 
observer’s  direction  of  motion. 

The Time-moving (or Object-moving) schema

According  to  the  time-moving  schema, 
events  in  time  are  ordered  based  on  the  direc¬ 
tion  of  motion  of  time.  The  front  of  a  time-mov¬ 
ing  scenario  is  the  furthest  point  in  the  direction 
of  motion  of  time.  Since  time  inevitably  moves 
from  the  future  to  the  past,  front  is  assigned  to 
past  or  earlier  events.  Once  again  an  analogous 
system  exists  for  ordering  objects  in  space  (see 
Figure  3).  When  two  objects  (without  intrinsic 
fronts)  are  moving,  they  are  assigned  fronts  based 
on  their  direction  of  motion.  The  front  here,  just 
as  in  the  domain  of  time,  is  assigned  to  the  lead¬ 
ing  part  of  an  object  in  the  direction  of  motion. 
The  light-colored  widget  is  said  to  be  in  front 
because  it  is  further  along  in  the  widgets’  direc¬ 
tion  of  motion. 

We  are  now  in  a  position  to  ask  whether 
the  same  relational  schemas  are  used  to  se¬ 
quence  objects  in  space  and  events  in  time.  If 
the  same  schemas  are  indeed  used  by  both  do¬ 
mains,  then  we  should  be  able  to  differentially 
prime  particular  spatial  schemas  to  affect  how 
people  think  about  time. 

In the first experiment we were interested in whether making
subjects think about spatial relations in a particular way would
affect how they think about time. We primed either the ego-moving
or the object-moving spatial schemas by asking subjects to answer
some questions about the spatial relations of objects in a picture.
We then asked subjects to interpret an ambiguous temporal statement
such as “Next Wednesday’s meeting has been moved forward two days.”
If the above sentence is interpreted using the ego-moving schema,
then forward is in the direction of motion of the observer, and the
meeting should now fall on a Friday. In the time-moving
interpretation, forward is in the direction of motion of time, and
the meeting should now be on a Monday.

If  the  domains  of  time  and  space  do  indeed 
use  the  same  relational  schemas,  subjects 
primed  in  the  ego-moving  spatial  frame  of  ref¬ 
erence  should  prefer  the  ego-moving  perspec¬ 
tive  for  reasoning  about  events  in  time,  and 
should  think  that  the  meeting  will  be  on  Fri¬ 
day.  Subjects  primed  in  the  object-moving 
frame  of  reference  should  prefer  the  time-mov¬ 
ing  interpretation  and  think  that  the  meeting  will 
be  on  Monday.  However,  if  the  domains  of 
space  and  time  are  represented  separately,  then 
spatial  primes  should  have  no  effect  on  the  way 
subjects  think  about  time. 
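
The two readings of "moved forward" can be written out as a small Python sketch. This is only an illustration of the predictions stated above, not part of the experimental materials; the function name `move_forward` and the schema labels used as string arguments are assumptions introduced here.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def move_forward(day, n_days, schema):
    """Return the day an event lands on after being 'moved forward'.

    schema == "ego-moving":  forward points toward the future (later days).
    schema == "time-moving": forward points toward the past (earlier days).
    """
    if schema == "ego-moving":
        step = n_days
    elif schema == "time-moving":
        step = -n_days
    else:
        raise ValueError(f"unknown schema: {schema!r}")
    # Wrap around the week so the index always stays in range.
    return DAYS[(DAYS.index(day) + step) % len(DAYS)]


# Target sentence from Experiment 1.
print(move_forward("Wednesday", 2, "ego-moving"))
print(move_forward("Wednesday", 2, "time-moving"))
```

Under these assumptions the ego-moving reading yields Friday and the time-moving reading yields Monday, which are the two response options scored in Experiment 1.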

EXPERIMENT  1 

METHOD 

Participants 

63  Stanford  University  undergraduates  par¬ 
ticipated  in  this  study  as  part  of  a  course  re¬ 
quirement. 

Materials & Design

A two-page questionnaire was constructed. The first page contained
four TRUE/FALSE priming questions about spatial scenarios.
Scenarios used either the ego-moving frame of reference (see Figure
2), or the object-moving frame of reference (see Figure 3). These
two frames of reference were predicted to map onto the ego-moving
and time-moving perspectives in time respectively. On a separate
page, participants were asked to read an ambiguous temporal
sentence (e.g. “Next Wednesday’s meeting has been moved forward two
days”) and report on which day the meeting has been rescheduled. A
control group of subjects responded to the target sentence without
having seen a prime. All subjects also provided a confidence score
for their answer to the target question on a scale of 1 to 5
(1 = not at all confident, 5 = very confident).

The dark can is in front of me.

Figure 2. Sample ego-moving scenario.

The light widget is in front of the dark widget.

Figure 3. Sample object-moving scenario.

Procedure 

Participants  completed  the  two-page  ques¬ 
tionnaire  individually  with  no  time  restrictions. 
The  two  pages  of  the  questionnaire  were  im¬ 
bedded  in  a  large  questionnaire  packet  contain¬ 
ing  questions  unrelated  to  this  study.  No  overt 
connection  was  made  between  the  two  pages 
of  the  questionnaire  pertaining  to  this  study. 

Results 

As  predicted  by  the  metaphorical  represen¬ 
tation  hypothesis,  participants  responded  in  a 
prime-consistent  manner.  Of  the  participants 
primed  in  the  ego-moving  frame  of  reference, 
73.3%  thought  that  the  meeting  was  on  Friday, 
and  26.7%  thought  it  was  on  Monday.  The  re¬ 
verse  pattern  was  true  for  the  participants 
primed  in  the  object-moving  frame  of  reference. 
Only 30.8% of the participants primed in the object-moving frame of
reference thought the meeting was on Friday, and 69.2% thought it
was on Monday. A chi-square test confirmed the effect of
consistency (χ² = 5.2, p < .05). Control participants (who had not
seen any primes) were roughly evenly split between Monday (45.7%)
and Friday (54.3%).
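
As an illustrative check on this kind of analysis, the Python sketch below computes a Pearson chi-square statistic for a 2 x 2 table of prime by response. The cell counts are not given in the text. The counts used here (11 of 15 ego-primed and 4 of 13 object-primed participants choosing Friday) are an assumption, back-calculated from the reported percentages, and the function name is hypothetical. Because the counts are inferred and the exact variant of the test is not stated, the value need not reproduce the reported statistic exactly.

```python
def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ASSUMED counts, inferred from the reported percentages.
# Rows: ego-moving prime, object-moving prime.
# Columns: answered Friday, answered Monday.
counts = [[11, 4],
          [4, 9]]

chi2 = chi_square_2x2(counts)

# 3.841 is the critical chi-square value for df = 1 at alpha = .05.
CRITICAL_05_DF1 = 3.841
print(f"chi-square = {chi2:.2f}, exceeds critical value: {chi2 > CRITICAL_05_DF1}")
```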

An  additional  measure  of  confidence  con¬ 
firmed  the  large  consistency  effect.  A  confi¬ 
dence  score  was  computed  for  each  subject  by 
scoring a prime-consistent response as +1 and a prime-inconsistent
response as -1, and multiplying by the confidence rating that had
been provided by the subject on a scale of 1 to 5. Under the null
hypothesis, the mean confidence score should equal 0. The mean
observed confidence score for the primed conditions was 2.14,
significantly higher than 0 (t = 2.81, p < .01),
once again confirming the consistency effect.
For  the  unprimed  or  control  condition,  the  mean 
confidence  score  did  not  differ  from  the  null 
prediction  (Mean  =  -0.23). 
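
The signed confidence score described above can be sketched in Python as follows. The response data in the example are hypothetical and serve only to show the computation. The function names are assumptions, and the one-sample t statistic is computed in the standard textbook way, which may differ in detail from the analysis actually run.

```python
import math
import statistics


def signed_confidence(prime_consistent, rating):
    """+rating for a prime-consistent answer, -rating otherwise (rating 1..5)."""
    if not 1 <= rating <= 5:
        raise ValueError("confidence rating must be between 1 and 5")
    return rating if prime_consistent else -rating


def one_sample_t(scores, mu=0.0):
    """t statistic for testing whether the mean of scores differs from mu."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return (mean - mu) / (sd / math.sqrt(n))


# HYPOTHETICAL responses: (answer was prime-consistent?, confidence rating)
responses = [(True, 5), (True, 4), (False, 2), (True, 3), (False, 4), (True, 5)]
scores = [signed_confidence(c, r) for c, r in responses]

print("scores:", scores)
print("mean:", statistics.mean(scores))
print("t:", one_sample_t(scores))
```

A mean near 0 corresponds to the null prediction (no preference for the primed schema), while a reliably positive mean indicates prime-consistent responding.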

There was, however, one concern about the spatial scenario pictures
used in Experiment 1. Besides the difference in underlying
structure, there were several superficial differences between the
ego-moving and the object-moving pictures used. The ego-moving
pictures always contained three entities, always had a person in
them, and contained only one arrow indicating direction of motion.
The object-moving pictures, on the other hand, contained only two
entities, never contained a person, and always had two arrows
indicating direction of motion. It is possible that any of these
superficial differences could affect the way people responded to
the target question. We conducted a follow-up experiment to address
the above issues.

EXPERIMENT 1A

In the follow-up experiment, we redesigned the stimuli to minimize
these superficial differences between the ego-moving and
object-moving scenarios (see Figures 4-5). A different group of 71
Stanford undergraduates participated in the follow-up experiment.
Just as in Experiment 1, there was a significant effect of schema
consistency (χ² = 6.28, p < .05). Subjects primed with ego-moving
pictures chose the ego-moving response (Friday) 63% of the time.
Subjects primed with the object-moving pictures chose the
time-moving response (Monday) 67% of the time.

The hat-box is ahead of the Kleenex.

Figure 4. Sample object-moving scenario for Exp. 1a.

Figure 5. Sample ego-moving scenario for Exp. 1a.

In both studies, subjects were influenced
by the primed spatial schemas when trying to
solve a problem about time. These findings
strongly  suggest  that  structured  relational  in¬ 
formation  is  shared  between  the  domains  of 
space  and  time. 

Discussion 

In  Experiment  1  and  the  follow-up  experi¬ 
ment  we  found  that  priming  particular  spatial 
schemas  can  affect  how  people  think  about 
time. Participants chose to disambiguate a sentence about time in a
manner that was consistent with a recently used spatial schema.
With these findings in hand, we can reject Murphy’s Structural
Similarity hypothesis, which states that the domains of space and
time, though similar, are not related. Experiments 1 and 1a provide
strong evidence for the claim that the domains of space and time
share relational structure.
However,  it  is  too  early  to  conclude  that  time  is 
understood  and  structured  as  a  metaphor  from 
space.  There  are  two  concerns. 

First,  since  our  data  came  from  a  question¬ 
naire,  we  have  no  direct  measurements  of  the 
real-time  processing  that  went  on  while  subjects 
were answering our questions. If we are to claim
that spatio-temporal expressions are processed
on-line  as  metaphorical  mappings,  we  must  be 
able  to  demonstrate  that  schema  consistency  has 
some  effect  on  real-time  processing.  To  address 
this  concern,  we  need  to  design  a  more  controlled 
laboratory  task  that  will  allow  us  to  assess  sub¬ 
jects’  on-line  processing. 

The  second  concern  has  to  do  with  how  the 
ego-moving  and  time-moving  schemas  are  rep¬ 
resented.  So  far  we  have  established  that  the 
domains  of  space  and  time  are  conceptually  re¬ 
lated,  and  that  they  share  some  relational  sche¬ 
mas.  These  findings  are  consistent  with  the  met¬ 
aphorical  representation  hypothesis  that  struc¬ 
tured  relational  information  is  stored  in  the  do¬ 
main  of  space  and  mapped  to  the  domain  of  time. 
However,  an  alternative  explanation  of  our  re¬ 
sults  is  that  there  are  generic,  domain-indepen¬ 
dent  schemas  that  are  shared  by  both  domains. 
If  we  are  to  claim  that  abstract  domains  like  time 
are  understood  as  metaphors  from  concrete  ex¬ 
periential  domains  like  space,  then  we  must  be 
able  to  show  that  there  is  directional  transfer 
between  the  two  domains.  We  must  show  that 
information  is  transferred  from  space  to  time, 
and  not  from  time  to  space.  Under  the  metaphor¬ 
ical  representation  hypothesis,  the  schema-con¬ 
sistency  effect  should  be  asymmetric;  there 
should  be  a  greater  effect  of  schema-consisten¬ 
cy  when  the  transfer  is  from  space  to  time,  than 
from  time  to  space.  Experiment  2  was  designed 
to  address  the  above  two  concerns. 

In Experiment 2, subjects’ response times
to questions about spatial and temporal relations
were measured. Each target question was
preceded  by  two  prime  questions  that  used  ei¬ 
ther  the  same  relational  schema  as  the  target  (a 
Consistent  trial)  or  used  a  different  relational 
schema  (an  Inconsistent  trial).  We  also  varied 
the  domains  from  which  the  target  and  prime 
questions  were  drawn  so  that  sometimes  spa¬ 
tial  primes  were  followed  by  target  questions 
about  time,  and  other  times  temporal  primes 
were  followed  by  target  questions  about  space. 
We also included trials where spatial primes were followed by
spatial targets, and temporal primes were followed by temporal
targets. These trials were necessary as manipulation checks; we
must be able to demonstrate that our stimuli can produce an effect
of consistency within a domain before we can interpret consistency
effects across domains.

EXPERIMENT 2

Predictions

Under  the  metaphorical  representation  hy¬ 
pothesis,  we  would  predict  that  subjects  should 
be  slower  to  respond  to  inconsistently  primed 
items  when  temporal  targets  are  preceded  by 
spatial  primes.  However,  there  should  be  no 
effect  of  consistency  when  spatial  targets  are 
preceded  by  temporal  primes.  The  exact  pre¬ 
dictions  are  below: 

Spatial  primes  to  spatial  targets.  When 
schema-inconsistent  primes  are  used,  the 
primed  inconsistent  schema  will  interfere  with 
processing  and  processing  time  will  increase. 

Spatial primes to temporal targets. When schema-inconsistent
spatial primes are used, the inconsistent spatial schema will
become very available. Since spatial schemas are used on-line for
understanding temporal scenarios, this inconsistent schema will
disrupt processing, causing processing time to increase.

Temporal primes to temporal targets. When schema-inconsistent
temporal primes are used, the product of the mapping that was
necessary to process the prime scenarios will become most
available. The product of this mapping will be the inconsistent set
of correspondences between the domains of space and time.
This  inconsistent  set  of  correspondences  will 
interfere  with  processing,  causing  processing 
time  to  increase. 

Temporal primes to spatial targets. When schema-inconsistent
temporal primes are used, what will become most available is the
product of the mapping that was necessary to process the prime
scenarios. The product of this mapping will be the inconsistent set
of correspondences between the domains of space and time. This set
of correspondences should have no effect on the processing of a
spatial scenario, since the domain of space is represented
directly, and does not depend on the domain of time.

METHODS 

Participants 

34  Stanford  University  undergraduates  par¬ 
ticipated  in  this  study  in  order  to  fulfill  a  course 
requirement. 

Materials 

The  experiment  used  128  prime  questions 
and  32  target  questions.  All  questions  were 
TRUE/FALSE.  Each  prime  question  appeared 
only  once.  Each  target  question  appeared  twice, 
once  primed  Consistently,  and  once  primed 
Inconsistently. 

Time Questions: 64 statements about months of the year were
constructed to use as primes. Half of these statements were phrased
using the ego-moving schema (e.g. “In March, May is ahead of us.”),
and the other half used the time-moving schema (e.g. “March comes
before May.”). Also, half of the statements were TRUE and half were
FALSE. Half of the statements referred to months that are “ahead”
or “before”, and half of the statements referred to months that are
“behind” or “after”. All of these variations were fully crossed
into eight types of primes. This was done to ensure that the task
was difficult enough that subjects would not be able to develop a
heuristic to answer the questions. In addition, 16 statements about
months of the year were constructed to use as target questions.
These statements were always TRUE, used either the ego-moving or
the time-moving schema, and always referred to months that are
“ahead” or “before”.

The tree is behind me.

Figure 6a. Sample ego-moving spatial scenario.

Space  Questions:  64  spatial  scenarios 
were  constructed  to  use  as  primes.  Each  sce¬ 
nario  consisted  of  a  picture  and  a  sentence. 
Half  of  these  scenarios  used  the  ego-moving 
schema,  and  the  other  half  used  the  object- 
moving  schema.  Also,  half  of  the  sentences 
were  TRUE  descriptions  of  the  spatial  rela¬ 
tions  portrayed  in  the  picture  and  half  were 
FALSE  descriptions.  Half  of  the  statements 
referred  to  objects  that  were  “in  front”,  and 
half  referred  to  objects  that  are  “behind”.  All 
of  these  variations  were  fully  crossed  into 
eight  types  of  primes.  Also,  left/right  orien¬ 
tation  of  the  pictures  was  counterbalanced 
across  these  variations. 
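
The crossing of the three prime variations can be enumerated with a short Python sketch. It only illustrates the counts stated above (three binary variations, eight prime types, 64 primes per domain). The variable names are assumptions, and the even split of primes across the eight types is also an assumption, since the text gives only the halves.

```python
from itertools import product

# The three binary variations described for the spatial primes.
SCHEMAS = ("ego-moving", "object-moving")
TRUTH_VALUES = (True, False)
DIRECTIONS = ("in front", "behind")

N_PRIMES = 64  # spatial primes, as stated in the text

# Fully crossing the variations gives every combination exactly once.
prime_types = list(product(SCHEMAS, TRUTH_VALUES, DIRECTIONS))

# ASSUMPTION: primes are divided evenly across the crossed types.
per_type, remainder = divmod(N_PRIMES, len(prime_types))

for schema, truth, direction in prime_types:
    print(f"{schema:14s} {str(truth):5s} {direction}")
print(f"{len(prime_types)} types, {per_type} primes per type")
```

The same enumeration applies to the temporal primes, with "ahead/before" and "behind/after" in place of the spatial terms and the time-moving schema in place of the object-moving schema.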

In  addition,  16  spatial  scenarios  were  con¬ 
structed  to  use  as  target  questions.  Sentences 
in  these  scenarios  were  always  TRUE  descrip¬ 
tions  of  the  picture,  used  either  the  ego-mov¬ 
ing,  or  the  object-moving  schema,  and  always 
referred  to  objects  that  are  “in  front”.  Sample 
items  can  be  found  in  Figure  6. 

Design 

Overall, the experiment has a three-factor, fully crossed,
within-subjects design. The design is 4 (transfer type) X 2
(consistency) X 2 (target type). There were four levels of transfer
type: (1) “space-to-space” - transfer from spatial primes to
spatial targets; (2) “space-to-time” - transfer from spatial primes
to temporal targets; (3) “time-to-time” - transfer from temporal
primes to temporal targets; and (4) “time-to-space” - transfer from
temporal primes to spatial targets. There were two levels of
consistency: (1) consistent - the primes and targets belong to the
same schema; or (2) inconsistent - the primes and targets belong to
different schemas. There were two levels of target type: (1)
ego-moving; and (2) object/time-moving.

Figure 6b. Sample object-moving spatial scenario.

Each  participant  completed  a  short  prac¬ 
tice  session  and  64  experimental  trials.  Each 
trial  was  composed  of  two  prime  questions  and 
one  target  question.  Each  target  was  present¬ 
ed  twice,  once  in  a  Consistent  trial,  and  once 
in  an  Inconsistent  trial.  A  trial  was  Consistent 
when  the  prime  questions  and  the  target  ques¬ 
tion  belonged  to  the  same  schema  (e.g.  ego- 
moving  prime,  ego-moving  target).  A  trial  was 
Inconsistent  when  the  prime  questions  and  the 
target  question  belonged  to  different  schemas 
(e.g.  ego-moving  prime,  time-moving  target). 
The  critical  measure  was  the  effect  of  consis¬ 
tency  on  the  response  time  to  the  same  target 
question  by  the  same  subject.  Trials  were  ran¬ 
domized  for  each  subject  with  the  constraint 
that  the  order  of  consistent  and  inconsistent 
presentations  of  the  same  target  was  counter¬ 
balanced  across  subjects.  For  each  subject, 
consistent  and  inconsistent  items  appeared 
first  and  second  equally  often. 
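
The design arithmetic can be checked with a brief Python sketch. The counts of targets, presentations and primes per trial come from the text. The constant and function names are hypothetical, and the equal number of trials per design cell is an assumption that follows only if the design is balanced.

```python
from itertools import product

TRANSFER_TYPES = ("space-to-space", "space-to-time",
                  "time-to-time", "time-to-space")
CONSISTENCY = ("consistent", "inconsistent")
TARGET_TYPES = ("ego-moving", "object/time-moving")

N_TARGETS = 32                # 16 temporal + 16 spatial target questions
PRESENTATIONS_PER_TARGET = 2  # once Consistent, once Inconsistent
PRIMES_PER_TRIAL = 2


def prime_schema(target_type, consistency):
    """Schema the two primes must use for a given target and consistency."""
    if consistency == "consistent":
        return target_type
    first, second = TARGET_TYPES
    return second if target_type == first else first


cells = list(product(TRANSFER_TYPES, CONSISTENCY, TARGET_TYPES))
n_trials = N_TARGETS * PRESENTATIONS_PER_TARGET
n_primes = n_trials * PRIMES_PER_TRIAL
trials_per_cell = n_trials // len(cells)  # ASSUMES a balanced design

print(len(cells), "cells,", n_trials, "trials,", n_primes, "primes,",
      trials_per_cell, "trials per cell")
```

Under these assumptions the numbers agree with the materials described earlier: 64 trials, each using two of the 128 prime questions exactly once.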

Procedure 

Participants  were  tested  individually.  Par¬ 
ticipants  completed  a  short  practice  session  fol¬ 
lowed  by  64  experimental  trials.  Upon  a  par¬ 
ticipant’s  completion  of  the  practice  session, 
the  experimenter  provided  feedback,  and  re¬ 
peated  the  instructions.  There  was  no  feedback 
for  the  64  experimental  trials. 

In  each  trial,  the  participant  saw  2  prime 
questions  followed  by  one  target  question.  Par¬ 
ticipants  did  not  know  that  the  experiment  was 
broken  up  into  such  trials,  nor  could  they  fig¬ 
ure  it  out  just  from  being  in  the  experiment. 


For  each  question  a  participant  needed  to  make 
a  TRUE/FALSE  response.  There  was  a  re¬ 
sponse  deadline  of  six  seconds. 

Results 

Results  are  summarized  in  Figures  7-10.  As 
predicted  by  the  metaphorical  representation 
hypothesis,  subjects  showed  a  schema  consis¬ 
tency  effect  when  the  direction  of  transfer  was 
from  space  to  time,  but  not  from  time  to  space. 
Subjects  also  showed  within-domain  consisten¬ 
cy  effects  in  both  the  space-to-space  transfer 
condition,  and  the  time-to-time  transfer  condi¬ 
tion.  For  each  transfer  type,  a  three-factor  (2 
Consistency  X  2  Target  type  X  34  Subjects) 
GLM  analysis  was  conducted.  For  each  com¬ 
parison  there  was  a  significant  effect  of  sub¬ 
jects  which  is  to  be  expected  due  to  large  indi¬ 
vidual  differences  in  reaction  time.  Error  re¬ 
sponses  were  omitted  from  all  analyses. 

Within-domain  schema  consistency 

Space-to-space: See Figure 7. Subjects responded significantly
faster to Consistently presented targets (mean RT = 1590 msecs)
than to Inconsistently presented targets (mean RT = 1700 msecs)
when both the prime and target questions came from the domain of
space (F = 5.01, p < .05). Establishing this within-domain
consistency effect was necessary as a manipulation check. There was
no interaction between Target type and Consistency. This means that
both ego-moving and
object-moving  targets  benefited  equally  from 
Consistency. 

Time-to-time:  See  Figure  8.  Subjects  re¬ 
sponded  faster  to  Consistently  presented  targets 
(mean  RT  =  1761  msecs),  than  to  Inconsistently 
presented targets (mean RT = 1896 msecs) when
both  the  prime  and  target  questions  came  from 
the  domain  of  time  (F=4.42,  p<.05).  Establish¬ 
ing  this  within-domain  consistency  effect  was 
necessary  as  a  manipulation  check.  There  was 
no  effect  of  Target  type,  and  no  interaction  be¬ 
tween  Target  type  and  Consistency.  This  means 
that  both  ego-moving  and  time-moving  targets 
benefited equally from Consistency.

Figure 7. Space-to-space results.

Cross-domain schema consistency

Space-to-time: See Figure 9. Subjects responded significantly
faster to Consistently presented targets (mean RT = 1973 msecs)
than to Inconsistently presented targets (mean RT = 2088 msecs)
when spatial prime questions preceded temporal target questions
(F = 4.20, p < .05). This schema-consistency effect means that
there was transfer from the domain of space to the domain of time.
This finding corroborates the hypothesis that people use spatial
schemas to think about time. There was no effect of Target type,
and no interaction between Target type and Consistency. This means
that both ego-moving and time-moving targets benefited equally from
Consistency.

Time-to-space:  See  Figure  10.  There  was 
no  effect  of  Consistency  in  this  condition 
(F = .71, p = .4). Response times to Consistently
presented targets (mean RT = 1562 msecs) did not differ
reliably from response times to Inconsistently
presented targets (mean RT = 1571
msecs). This means that there was no transfer
from  the  domain  of  time  to  the  domain  of  space. 
There  was  no  interaction  between  Target  type 
and  Consistency,  meaning  that  the  ego-  and 
object-moving  targets  were  equally  unaffected 
by  Consistency. 


Figure 8. Time-to-time results.

These  results  are  consistent  with  the  hy¬ 
pothesis  that  temporal  scenarios  are  understood 
and  structured  in  terms  of  on-line  mappings 
from  the  domain  of  space.  These  findings  are 
also consistent with the results of Experiments
1 and 1a. These results confirm that the domains
of  space  and  time  share  structured  relational 
information  on-line.  Furthermore,  we  found 
that  the  transfer  is  directional;  there  is  an  effect 
of  schema  consistency  when  the  transfer  is  from 
space  to  time,  but  not  the  reverse. 
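
The pattern across the four transfer conditions can be summarised by subtracting the mean Consistent RT from the mean Inconsistent RT. The Python sketch below uses only the means reported above. The variable names are illustrative, and the subtraction is a descriptive summary rather than the GLM analysis reported in the text.

```python
# Mean response times in msecs as reported: (Consistent, Inconsistent).
MEAN_RT_MS = {
    "space-to-space": (1590, 1700),
    "time-to-time": (1761, 1896),
    "space-to-time": (1973, 2088),
    "time-to-space": (1562, 1571),
}

# Consistency effect = slowdown on Inconsistent relative to Consistent trials.
effects = {
    transfer: inconsistent - consistent
    for transfer, (consistent, inconsistent) in MEAN_RT_MS.items()
}

for transfer, effect in effects.items():
    print(f"{transfer:15s} {effect:4d} msecs")
```

The three conditions with a significant effect show slowdowns of more than 100 msecs, whereas the time-to-space difference is under 10 msecs.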

DISCUSSION 

Consistent with the metaphorical representation hypothesis, we find
an asymmetry in the sharing of relational information between the
conceptual domains of space and time. There was an effect of
schema-consistency when temporal targets were preceded by spatial
primes: subjects were slowed in solving problems about temporal
relations if they had just completed schema-inconsistent problems
about spatial relations. There was no effect of schema-consistency
when spatial targets were preceded by temporal primes: subjects
were not slowed in solving problems about spatial relations if they
had just completed schema-inconsistent problems about temporal
relations. These findings support the metaphorical representation
hypothesis. There appears to be directional, on-line transfer of
information from the concrete domain of space to the abstract
domain of time. Results described above disconfirm the alternative
hypothesis that the conceptual domains of space and time share
generic domain-independent relational schemas. Ego-moving and
object-moving schemas appear to be imported (borrowed) on-line from
the domain of space, and used to organize events in time.

Figure 9. Space-to-time results.

CONCLUSIONS 

We found that people understand time in terms of space, but not
space in terms of time. In Experiment 1, subjects were influenced
by spatial perspective when reasoning about events in time. In
Experiment 2, we showed that subjects were slowed in processing
temporal statements if they were primed with an inconsistent
spatial schema. This effect of consistency was present only in
transfer from space to time, and not from time to space, indicating
that there is a directional structure-mapping between these two
domains. These findings lend support to the metaphorical theory of
representation. It appears that abstract domains such as time are
indeed structured on-line as metaphorical mappings from more
concrete and experiential domains such as space.

Figure 10. Time-to-space results.

It is still unclear, however, whether linguistic metaphors shape
the way we think about abstract domains, or whether they simply
reflect pre-existing conceptual mappings. A set of cross-linguistic
studies is currently underway examining the role of language in
shaping abstract thought.

ABSTRACT

There is substantial controversy concerning the role of metaphor in
conceptual structure. According to one view (e.g. Lakoff and
Johnson, 1980; Gibbs, 1996), some basic, direct experiences are
used to structure and conceptualize, by means of metaphor, other,
more abstract matters that are not directly experienced. Another
view (e.g. Keil, 1986; Gentner, 1989; Murphy, 1996a) treats
conceptual metaphors as a kind of cross-domain structural analogy
that can be constructed only on the basis of pre-existing
conceptual knowledge in both domains. Some empirical evidence based
on the study of priming in complex metaphor comprehension is
presented for either position; however, deeper theoretical analysis
favours the structural analogy hypothesis.

INTRODUCTION 

Overview of the metaphoric representation hypothesis

A lot of work has been done, mostly within cognitive linguistics,
to demonstrate that abstract concepts like social engagements
(love) and position, emotion, or mental states are represented
metaphorically (hereafter “MRH” stands for the metaphoric
representation hypothesis). The idea is that image-schemata of
bodily experiences, such as orientation, containment etc., provide
both structure and, at least partly, meaning for more abstract
concepts such as love, anger, argument or social position (Lakoff
and Johnson, 1980; Johnson, 1987; Lakoff, 1987, 1991; Gibbs, 1992,
1994). The evidence for that claim originates mostly from
linguistic data: the examination of the frequency of idiomatic
expressions and their commonalities across languages, and the
soundness of this kind of metaphor. The common example is the LIFE
IS A JOURNEY metaphor, where journey relies closely on the
experience of moving, and life is an abstract concept.

The hypothesis, however, has never been seriously examined in
psychology (Baranski, 1996; Murphy, 1996a). Gibbs (1996), in
his discussion with Murphy, cites several psychological
findings as supporting the metaphoric concepts hypothesis, but
the evidence seems to be indirect at best. On the other hand,
Murphy (1996a, b) criticises the linguistic evidence and claims
that the universality of some idiomatic expressions could be
better explained by appealing to the structural similarity
hypothesis. That claim seems to be much better grounded in the
experimental data (Gentner, 1989; Medin, Goldstone, and
Gentner, 1993).

Structural similarity hypothesis

The structural similarity hypothesis assumes that analogies,
metaphors and structural similes are based on the systematicity
of the mapping between base and target domains. The
systematicity is estimated over the size and structure of the
relational system matched between domains, with higher-order
relations (e.g. causal ones) weighted higher than first-order
relations (e.g. bigger than), and these in turn higher than
object attributes (e.g. perceptual properties like colour or
size).

Metaphor: shared experience structure or cross-domain analogy?
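The weighting described above can be illustrated with a minimal sketch. The numeric weights and the data layout are assumptions chosen for illustration only: the text specifies the ordering (higher-order relations above first-order relations above object attributes), not any particular values or scoring procedure.

```python
# Illustrative weights: only their ordering comes from the text;
# the numbers themselves are assumptions.
WEIGHTS = {"higher_order": 3.0, "first_order": 2.0, "attribute": 1.0}


def systematicity(matches):
    """Sum the weights of the matched elements in a candidate mapping.

    `matches` is a list of (kind, label) pairs, where kind is one of
    the keys of WEIGHTS.
    """
    return sum(WEIGHTS[kind] for kind, _label in matches)


# Two hypothetical mappings between a base and a target domain.
relational = [("higher_order", "cause"), ("first_order", "bigger_than")]
superficial = [("attribute", "colour"), ("attribute", "size")]

print(systematicity(relational))   # 5.0
print(systematicity(superficial))  # 2.0
```

Under these assumed weights a mapping that shares relational structure outscores one that shares only surface attributes, which is the qualitative prediction of the hypothesis.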

Developmental  consequences  of  the  MRH 

For our part, we would like to note that, when considered
within psychological theory, MRH is, first of all, a
developmental hypothesis. The crucial test for it is then to
demonstrate that (1) schematic representation of bodily
experiences precedes the acquisition of abstract concepts, and
(2) children use the structure of their bodily experiences to
acquire abstract concepts. No one, however, has tested these
hypotheses directly. The analysis of contemporary developmental
studies provides evidence neither for the second claim nor for
the first, which intuitively seems much more plausible.
Although the development of abstract conceptions, such as
concepts of mental activities or social relations, is perhaps
well grounded in direct (mostly perceptual, but also motor)
experiences, the paths of development seem to be separate at
very early stages (Baron-Cohen, 1995; Leslie, 1994; Premack and
Premack, 1995; but see also Smith and Katz, 1996, for an
alternative view). Also, many early analogies in young
children, even if superficially similar to Lakoff and Johnson’s
orientational metaphors, are more likely to be based on
structural similarity (Gentner and Ratterman, 1991; Gentner et
al., 1995).

Reanalyses of studies of the early use of metaphor, which were
not directly designed to test MRH, give no clear support for it
either. For example, in our study of pre-schoolers’
understanding of the physical transfer metaphor for mental
actions (Haman, 1991, 1997) we found a pattern of answers which
could be better explained by the structural similarity
hypothesis than by MRH. Children between four and seven, asked
to interpret such expressions as “to give an idea” or “thoughts
scattered” in the context of their play and school activities,
demonstrated a developmental shift from mixed to mental
interpretation. While younger children correctly interpreted
“to give an idea” as “to tell it out”, they also incorrectly
inferred that only the recipient will have the idea once it is
given. Older children correctly interpreted both parts of the
metaphor. So far, this result could be taken as evidence for
the hypothesis that the abstract representation of mental
transfer is built step by step on the physical transfer
metaphor. However, we found very few purely literal
interpretations even in the youngest children. Moreover, in the
nonmetaphoric condition children asked exactly the same
question, “who has the idea”, easily realised that both agent
and recipient have it (or sometimes even attributed the
“copyrights” to the agent only). Metaphor played a misleading
rather than a constructive role here. Second, a “U”-shaped
change was observed in the metaphor understanding task. There
was an intermediate level of performance, at which children
could realise that both agent and recipient have the idea, but
failed to find the “to give” = “to tell out” equivalence. This
pattern of results suggests that there was a change in the
structure of children’s understanding of mental transfer, which
promoted a new level of structure mapping between tenor and
vehicle. That change seems to be better linked with the
parallel development of theory of mind than led by the physical
transfer metaphor. We assume, then, that the study of
cross-domain structure mapping and transfer is an important
part of the empirical exploration of the MRH vs. structural
similarity trade-off.

DOMAIN  SPECIFICITY  AND 
METAPHOR 

There are several studies demonstrating that metaphor
understanding proceeds domain by domain rather than single term
by term (Keil, 1986; Kelly and Keil, 1987; Tourengeau and
Sternberg, 1982). To some extent this is also congruent with
MRH. However, contrary to MRH, it was found that a fair level
of understanding and structuring in both domains (tenor and
vehicle) was required to establish a metaphoric relation (Keil,
1986). Note that at least some of the metaphors used in Keil’s
study matched the type of metaphors considered in MRH. We will
now discuss domain-specific conceptual representations and
their links to the MRH vs. structural similarity trade-off.

Domain  specificity 

There is a good deal of work elaborating the hypothesis that
concepts are represented within larger structures: domains
(Keil, 1986, 1989; Hirschfeldt and Gelman, 1994a; Haman,
1997a). It is hard to provide a single and exhaustive
definition of a domain (see Hirschfeldt and Gelman, 1994b, and
other papers in the Hirschfeldt and Gelman, 1994a, volume;
Haman, 1997a, b). For the current problem two properties of
domains are the most important: their quasi-modular status and
their underlying domain theories. The first conflicts with the
developmental interpretation of MRH. The second has some
interesting consequences for metaphorical asymmetry, which has
been claimed to support MRH.

Foundational domains

There is no agreement on how many domains there are, and which
of them are developmentally foundational. Most researchers
agree, however, that the physical object/mental entity
distinction is based on very early cognitive achievements.
There is no room to discuss this issue here. While we are aware
that there is a lot of arbitrariness in this, we think it is
justified to assume that at least adults (but perhaps already
children at preschool age) represent mental and social
phenomena, artefacts, inanimate natural kinds, plants, and
animals (as well as some nominal kinds like language and
number) in separate domains. These domains differ on the
dimension of complexity and consistency of their foundational
theories, from the complex and consistent domain of animals,
through plants and inanimates, to artefacts, which lack
specific causal theories¹.


Metaphorical  asymmetry 

One of the problems which Gibbs (1996) raises against the
structural similarity hypothesis is the asymmetry of
foundational metaphors. People tend, for example, to speak
about love in terms of a trip, but not vice versa. That
asymmetry is not, however, exceptionless. In Polish, for
example, you could say “On kipi z wściekłości” (He is just
boiling with fury), which is a classical example of the
pressure-in-the-container metaphor for emotions; however, the
reverse, “Zupa w garnku kipi wściekle” (Soup in the pot boils
furiously), is almost equally sound. Indeed, Murphy’s (1996a,
b) appeal to typicality and salience to explain that asymmetry
is not convincing.

Our studies reported in Haman (1997) allowed us, however, to
propose another explanation. We have adopted Keil’s (1989)
hypothesis that conceptual domains differ on the dimension of
representational complexity at the level of the domain’s theory
(in general, natural kinds, and especially animate ones, are
underlaid by a complex causal network, and that complexity
declines through inanimates, complex artefacts and simple
artefacts to nominal kinds, while arbitrariness and
well-definedness rise in that direction).

Using metaphor understanding and ad hoc categorization tasks we
have shown that domains which are not underlaid by a complex
causal theory are good “exporters” of objects and structures to
other domains (i.e. good metaphor vehicles), while domains with
richer theories tend rather to assimilate elements of other
domains (they are good “importers”, or metaphor tenors).

It is not obvious that structure mapping has to be thought of
as a process realised by a special device such as the SME in
Gentner’s (1989) model. SME first generates partial mappings
and then makes a decision on the basis of maximum
systematicity. In contrast, we think of structure mapping as a
process of establishing cross-domain links and finding
correspondences in conceptual structure in situ. Highly
interconnected and coherent domains are much less likely to
generate a link to a less structured domain; they can, however,
more easily find correspondences to a structure “imported” from
another, less structured domain. The systematicity principle
plays the same role here as in Gentner’s model, but it is
computed in the context of the entire structure of the target
domain in the problem.

Rationales  for  the  current  study 

None of the studies noted above was designed directly to test
MRH. Here we propose a method that explicitly contrasts the
metaphoric representation hypothesis with cross-domain
structure mapping. We have designed two experiments as a
preliminary test, which could be a starting point for more
extensive research. This is still a study of adults’ metaphor
comprehension, so it leaves developmental issues unexplored.

EXPERIMENTS 

Overview 

The method of our experiments is based on Boronat and Gentner’s
(1990) study of compound (extended) metaphor understanding.
Their materials consisted of metaphors composed of two parts,
each of which was a legal and complete metaphor in itself. The
two parts could be either consistent, i.e. their vehicles
originated from the same conceptual domain, or inconsistent,
i.e. their vehicles originated from different domains. In both
cases, however, the interpretation of both parts taken together
provided a coherent meaning for the entire utterance. An
example of a consistent metaphor is: “Was Anna still boiling
mad when you saw her?” “No, she was doing a slow simmer.” An
inconsistent metaphor could be: “Was Anna still a raging beast
when you saw her?” “No, she was doing a slow simmer.” Both
metaphors have the same interpretation, the very same final
component, and a very similar structure; however, the initial
component of the inconsistent metaphor uses the animal domain
as a vehicle, while the initial component of the consistent
metaphor, like the final component, takes fluid dynamics as its
vehicle.


The main idea of the experiment was that if metaphors are
processed domain by domain, the first part of the metaphor will
prime the understanding of the second part only if the metaphor
is consistent².

In our study we assumed that if MRH is correct, then compound
metaphors consistent with respect to basic bodily experience
will show a stronger priming effect than metaphors based on
domain consistency, as they access foundational conceptual
relations. If the cross-domain structure mapping hypothesis is
correct, then domain knowledge and structure will support
priming. The two experiments described below share the same
general design but differ in the kinds of MRH metaphors
explored, in the definition of conceptual domain, and in the
way the priming effect was assessed. In Experiment 1 a more
general conception of domain was adopted (based on Keil’s,
1989, distinction between natural, artefact, and nominal
kinds), and only ontological metaphors (in Lakoff and
Johnson’s, 1980, sense) were used. In Experiment 2 we
introduced finer distinctions on both dimensions.

EXPERIMENT  1. 

Method 

Subjects. An initial sample of 24 undergraduates, paid for
participation, was tested. Nine of them were excluded from the
final analysis because of extreme variance in their results
(with respect to a 2SD threshold criterion), so the final
sample consisted of 15 subjects.

Materials. A set of component metaphors was created. Each of
the components (simple metaphors) could be classified into one
of Lakoff and Johnson’s (1980) schemes (part/whole, container,
and path/journey) and one of Keil’s (1989) kind types (natural,
artefact, and nominal). In order to combine any domain with any
image scheme, a total of 36 compound metaphors was created, so
that each compound metaphor was consistent either on both
Lakoff and Johnson’s dimension (MRH consistency) and Keil’s
dimension (DSC, domain structure consistency), or on only one
of them, or on neither. The compound metaphors were distributed
over three equivalent sets of 12 elements. Five subjects from
the final sample worked with each set. Each set consisted of
three compound metaphors of each type: MRH+/DSC+, MRH+/DSC-,
MRH-/DSC+, and MRH-/DSC-, where “+” means that the metaphor is
consistent, and “-” that it is inconsistent, on a given
dimension.
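The two-dimensional classification of compound metaphors can be sketched as follows. This is an illustrative reading of the design, not the authors' coding procedure: the `Component` type, its field names and the example codings are hypothetical, and consistency is assumed here to mean simple identity of schema or of kind type between the two components.

```python
from collections import namedtuple

# A component metaphor is described by its image schema (Lakoff and
# Johnson) and the kind type of its vehicle (Keil).
Component = namedtuple("Component", ["schema", "kind"])


def consistency_type(initial, final):
    """Return a label such as 'MRH+/DSC-' for a compound metaphor."""
    mrh = "+" if initial.schema == final.schema else "-"
    dsc = "+" if initial.kind == final.kind else "-"
    return f"MRH{mrh}/DSC{dsc}"


# Hypothetical codings, for illustration only.
a = Component(schema="part/whole", kind="artefact")
b = Component(schema="part/whole", kind="nominal")
c = Component(schema="container", kind="artefact")

print(consistency_type(a, b))  # MRH+/DSC-
print(consistency_type(a, c))  # MRH-/DSC+
print(consistency_type(a, a))  # MRH+/DSC+
print(consistency_type(b, c))  # MRH-/DSC-
```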

The initial and final components of each metaphor were matched
with respect to sentence length and complexity.

An example is: “Company is a complex puzzle composed of many
small elements. But only such a complex team could reach the
world-cup” (MRH+, based on the whole/part metaphor; DSC*).

Procedure. Subjects were tested individually. Each subject read
an instruction on the computer screen. The instruction asked
the subject first to press the space bar to display the initial
component of the metaphor, read it, press the space bar again,
read the final component, and then press the space bar when
ready to explain the entire (compound) metaphorical sentence.
Only a single component sentence was displayed at a time. The
first press of the space key started the clock and the second
stopped it, so for each compound metaphor we obtained two
reaction times: one for the first component and one for the
second. Test trials were preceded by a single training trial.
Subjects’ explanations were given verbally and tape-recorded
(however, we will not refer to them in the discussion of the
results).

Scoring. The priming effect was assessed by estimating the per
cent of the time necessary to explain the entire metaphor after
reading the second component in comparison to the first
component. That was expressed in the equation (RT1*100)/RT2,
where RT1 is the time of reading the initial component and RT2
is the time necessary to explain the metaphor after the final
component was displayed. Finally, for each subject the mean of
the three metaphors representing the same configuration of
consistency was computed.
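The scoring step can be sketched in a few lines. The sketch follows the equation as printed, (RT1*100)/RT2, and then averages the index over one subject's metaphors in a single consistency cell. The function names and the reaction times are made up for illustration; nothing here reproduces the original data or analysis software.

```python
from statistics import mean


def priming_index(rt1, rt2):
    """Index as printed in the text: (RT1 * 100) / RT2."""
    return (rt1 * 100) / rt2


def subject_cell_mean(trials):
    """Mean index over one subject's trials in one consistency cell.

    `trials` is a list of (rt1, rt2) pairs in any consistent time unit.
    """
    return mean(priming_index(rt1, rt2) for rt1, rt2 in trials)


# Made-up reaction times (seconds) for three metaphors of one type.
trials = [(4.0, 2.0), (3.0, 2.0), (5.0, 4.0)]
print(subject_cell_mean(trials))
```

Because the index is a ratio of two times measured on the same trial, it is unit-free, which is what allows it to be averaged across metaphors of different lengths.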


Results  and  discussion. 

Table 1 summarizes the results over all three metaphor sets. A
3x2x2 ANOVA (set by MRH consistency by DSC, with repeated
measures on both consistency factors) was computed. There were
main effects of MRH (F[1;12]=5.61, p<.035) and of metaphor set
(F[2;12]=7.65, p<.007), and a DSC effect at tendency level
(F[1;12]=3.19, p=.10). No interaction effect was found.

          DSC+      DSC-      Mean
MRH+     155.13    206.98    180.56
MRH-     191.23    266.08    228.66
Mean     172.68    236.94    204.61

Table 1. Mean times necessary to understand the metaphor as a
per cent of RTs for the first component.

A significant main effect of MRH and only a tendency in the
direction of DSC seem to support the metaphoric representation
hypothesis as a main factor in metaphor processing. A close
look at the data makes that interpretation implausible. Taken
together, as well as separately in each set, MRH-/DSC+
metaphors were processed faster than MRH+/DSC- ones. The
difference is not very large, and so not significant, but it is
systematic. So the reason for the non-significant DSC effect is
a higher variance within that category. We could try to explain
the source of that variance.

The most important sources of error variance (and perhaps those
which caused the exclusion of 9 subjects) were difficulties in
establishing the full equivalence and relative soundness of
metaphor components. Metaphors’ soundness varied across
metaphors and sets, as documented by the significant set effect
(however, the general pattern of results was similar across
sets: there was no interaction between set and other factors).
For example, it is not easy to find a good Lakoffian-type
metaphor based on inanimate natural kinds. As we used a very
broad concept of domain, some domain-consistent metaphors have
in fact very little common ground. To avoid some of these
problems we planned Experiment 2.

EXPERIMENT 2.

The aim of Experiment 2 was to minimize the variance related to
procedure. We constructed new materials as well as a new
measure of the priming effect. Finer distinctions of domain and
different types of Lakoffian metaphors allowed us to assess
additionally the role of the specific domain and metaphor type.
We also altered the method of manipulating and measuring
priming. Here we vary only the initial component (primer). The
final component is the same for MRH+, DSC+, and inconsistent
metaphors.

Method 

Subjects. Twenty-four undergraduates, paid for participation,
were tested. Three additional subjects were excluded because of
missing results.

Materials. A basic set of 12 final components was created. For
each of them there were 3 different initial components: MRH
consistent (MRH+), domain consistent (DSC+), and inconsistent
in both domain and MRH (INC), so the product was 36 compound
metaphors divided into 3 experimental sets. Metaphors with the
same final component were never included in the same set. As in
the first study, the compound metaphor as a whole had a
congruent meaning independently of MRH or DSC consistency.
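One way to satisfy the constraints just stated (36 compound metaphors, 3 sets, no final component repeated within a set) is a simple rotation of primer conditions over final components. The sketch below is an assumption about how such a counterbalanced assignment could be built; the text does not say which assignment was actually used, and the integer indices stand in for the real sentences.

```python
CONDITIONS = ["MRH+", "DSC+", "INC"]
N_FINALS = 12


def build_sets(n_finals=N_FINALS, conditions=CONDITIONS):
    """Rotate primer conditions over final components (Latin square).

    Returns one list per experimental set; each item is a
    (final_component_index, condition) pair.
    """
    k = len(conditions)
    sets = [[] for _ in range(k)]
    for final in range(n_finals):
        for s in range(k):
            sets[s].append((final, conditions[(final + s) % k]))
    return sets


sets = build_sets()
for i, items in enumerate(sets, start=1):
    counts = {c: sum(1 for _, cond in items if cond == c) for c in CONDITIONS}
    print(f"set {i}: {len(items)} items, {counts}")
```

With this rotation every set contains each final component exactly once and four metaphors of each consistency type, and across the three sets each final component is paired with each of its three primers.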

Lakoff and Johnson differentiate several types of metaphoric
conceptualisation. Accordingly, MRH+ metaphors were further
classified as structural, containment, and orientational. The
vehicle of DSC+ metaphors belonged to one of three domains:
inanimate natural kinds, plants, and animals. To achieve
maximum consistency of results all MRH+ initial components were
based on the artefact domain (we will discuss that later, in
the results section). Each experimental set consisted of four
MRH+ metaphors (one of each type, with one type doubled), four
DSC+ metaphors (two inanimate natural kind and two animate: one
plant and one animal), and four inconsistent (INC) metaphors.
An example is: “His hopes had been like towers” (MRH+, based on
the orientational metaphor: up=better, down=worse), or “His
hopes were like galloping bison” (DSC+, based on the domain of
animals), or “His hopes were like an attacking tank division”
(INC), with the common final component: “After some time his
hopes came to be like a little bird dropped from the nest.”

Procedure. The procedure was fully analogous to that of
Experiment 1.

Scoring. As the same final components were used for different
consistency conditions, we used only RT2, i.e. the time
necessary to explain the metaphor after the second component
was displayed, as a measure of the priming effect. If a single
subject had more than one RT in the same cell of the ANOVA
design, an average was computed (e.g. for two DSC+ metaphors
based on inanimates).

Results  and  discussion. 


MRH+     7.184
DSC+     8.269
INC      9.133
Mean     8.195

Table 2. Mean RTs for three types of metaphor consistency.

Table 2 contains summary results for consistency type. A 3x3
ANOVA (experimental set by consistency, with repeated measures
on consistency) was performed. There was no effect of
experimental set, nor a set by consistency interaction. The
main effect of consistency was highly significant
(F[2;45]=7.45, p<.0015). Planned comparisons revealed that both
MRH+ and DSC+ metaphors were processed faster than inconsistent
(INC) ones: F[1;21]=12.60, p<.002 and F[1;21]=5.14, p<.035
respectively. MRH+ metaphors were also processed faster than
DSC+ ones, although the difference only approaches tendency
level (F[1;21]=3.92, p=.061). Thus again we have obtained
evidence for both sources of metaphor consistency. The
relatively faster performance in the case of MRH+ metaphors
than DSC+ ones is reasonable, as all MRH+ metaphors had the
artefact domain as vehicle in the initial component. We have
argued earlier that artefacts are very good vehicles, as they
are not underlaid by a systematic causal-explanatory network.

Maciej Haman

We also performed two ANOVAs to test the effects of Lakoffian
metaphor type (structural, orientational, container) and of
domain (inanimates, plants, animals). A 3x2x3 design
(experimental set by MRH+/DSC+ by metaphor type) gave no
significant effect or interaction. A second 3x2x3 design
(experimental set by MRH+/DSC+ by domain) also gave no
significant effect; there were, however, some interesting
tendencies. Plant metaphors were processed faster than animal
metaphors (F[1;21]=3.46, p=.077). The overall effect of domain
also approaches tendency level of significance (F[4;42]=2.25,
p=.115). The results are congruent with our expectation that
metaphors based on the animal domain need more time for
processing because of the complex net of underlying relations.

GENERAL  DISCUSSION  AND 
CONCLUSION 

We have searched for an empirical test to deal with the
metaphoric representation vs. structural similarity hypothesis
trade-off. Priming in the compound metaphor understanding task
could provide some data to estimate the role of the relations
proposed by MRH and of domain-specific naive theories in
providing a common ground for both components of the metaphor.
So far the results provide some support for both hypotheses.
There are, however, some as yet unproven arguments for the
domain knowledge view. First, the domain of the metaphor
vehicle seems to influence the processing of MRH+ metaphors as
well. Second, we have not found any evidence for a privileged
position of MRH+ consistency, as could be expected if it were a
base for conceptual representations. As Murphy (1996a, b)
argues, MRH+ consistency could also be explained by structural
similarity, and the recurrent question is which is
developmentally earlier. That is, however, very hard to test.
It is important, then, to search for evidence in adult
performance as well.


Our study was designed as a pilot attempt to approach the
problem experimentally. We think that the results give reason
to refine the compound metaphor task in order to control
consistency, domains, and metaphor soundness independently.
ABSTRACT

We argue that the generation of every sentence involves first
the perception of analogy between two conceptual structures,
and then an operation of linguistic mapping. Sentence
interpretation starts with an attempt to reconstruct the
analogical mapping configuration underlying the sentence (i.e.,
the mapping operation performed by the speaker).

According  to  this  view,  an  important  role 
of  grammar  is  to  formally  mark  various  ana¬ 
logical  mapping  configurations,  thereby  provid¬ 
ing  cues  to  the  hearer  in  the  interpretation  pro¬ 
cess.  Different  grammatical  systems  have 
evolved  in  different  languages  to  formally  mark 
such  mapping  operations. 

1. WORKING ASSUMPTION: THE
CONSTRUCTION GRAMMAR VIEW

The basic assumption in the analysis is the Construction
Grammar view (as proposed by Fillmore & Kay, 1993, as well as
Lakoff, 1987, Goldberg, 1995, and others), the basic
propositions of which are also shared by Langacker’s Cognitive
Grammar approach (1987, 1991). The assumption is that languages
are made up of constructions: pairings of grammatical forms
(syntactic or morphological) and semantic structures. Mastery
of language consists of mastery of these form-meaning pairs.
Syntactic forms in particular are associated with conceptual
schemas representing generic event structures which are basic
to human experience, such as manipulation of objects, bodily
movement through space, and dynamics of force and enablement
(Goldberg, 1995). These schemas are thought of as tools for
organizing comprehension and communication and can structure
(indefinitely) many perceptions, images, and events (see also
the notions of image schemas and conceptual archetypes in
Johnson, 1987, Langacker, 1991, Talmy, 1988, Turner, 1996). In
recent years, cognitive scientists have found strong evidence
for the existence of such event schemas. Examples include the
role of event schemas in metaphorical understanding (Lakoff &
Johnson, 1980, Sweetser, 1990), and as precursors for language
acquisition by children (Mandler, 1992, in press).

Given the Construction Grammar assumption (and its cognitive
linguistics extensions), we can now talk about linguistic
entities such as the English Transitive Construction. The
syntactic form of the construction is [NP V NP] (=SUB V OBJ).
Its associated semantic schema is the archetypal “transitive”
event (as defined, for example, in Givón, 1984): an agent
(typically human), who volitionally acts on (i.e., exerts
physical force on) and affects another entity (a patient)¹.
Each role in the semantic schema is associated with one
grammatical category in the syntactic pattern (Figure 1): the
agent role is associated with the subject NP, the patient role
with the object NP, and the force-dynamic relation between the
two entities is associated with the main verbal slot. The
semantic schema and its association with the syntactic form are
extracted by speakers from frequently encountered instances of
the construction (i.e., instances of two-participant transitive
sentences).

Figure 1. The English Transitive Construction.

¹ This schematic event structure clearly represents only the
most prototypical sense of the simple Transitive construction.
A full description of this grammatical construction involves a
network of extensions to the prototypical sense as well as a
list of idiomatic uses of the construction (as, for example, in
Goldberg’s study of constructions, 1995). The network
description of a construction is analogous to a description of
a prototypical sense of a lexical item, which nearly always
involves a network of polysemous and metaphorical extensions.

Analogy Underlies Sentence Generation and Interpretation
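The role-to-slot associations of the Transitive Construction described in this section can be pictured as a small lookup structure. The sketch below is only an illustration of the form-meaning pairing: the dictionary layout, the `realise` function and the slot labels are assumptions made here, and the example sentence is the one the paper itself goes on to discuss.

```python
# Role-to-slot associations of the English Transitive Construction,
# as described in the text ([NP V NP] = SUB V OBJ).
TRANSITIVE = {"Agent": "SUB", "force-dynamic relation": "V", "Patient": "OBJ"}
SLOT_ORDER = ["SUB", "V", "OBJ"]


def realise(construction, fillers, slot_order=SLOT_ORDER):
    """Bind lexical fillers to syntactic slots via their semantic roles."""
    by_slot = {construction[role]: words for role, words in fillers.items()}
    return " ".join(by_slot[slot] for slot in slot_order)


sentence = realise(
    TRANSITIVE,
    {"Agent": "Mary", "force-dynamic relation": "poisoned", "Patient": "her lover"},
)
print(sentence)  # Mary poisoned her lover
```

The point of the sketch is that the fillers are assigned to positions through their semantic roles rather than directly, which is the sense in which the construction pairs a syntactic form with a semantic schema.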

2.  ANALOGICAL  PERCEPTION  AND 
MAPPING  OPERATIONS  IN  SENTENCE 
GENERATION. 

Consider a basic transitive sentence in English, such as “Mary
poisoned her lover” (generated, say, by a detective
investigating a murder case). The actual event in the world
involved a complex sequence of events: Mary first made a
(probably intentional) decision to kill her lover. She decided
to use poison. She found (or bought) some poison and put it in
her lover’s food. The lover ate it, felt sick, and after a
while died.

But  at  some  more  abstract  level,  this  se¬ 
quence  of  events  is  also  perceived  by  the  speak¬ 
er  as  an  instance  of  a  more  generic  event  struc¬ 
ture:  An  agent  (Mary)  acting  on  and  affecting  a 
patient (her lover). At this abstract level, the actual details of the
event are ignored, and an analogy is perceived between the high-level
structure of the novel event and the “transitive” event schema (‘Agent
acts on and affects Patient’). At this level of abstraction, Mary, who
initiated the whole causal sequence of events, is perceived as
analogous to (or an instance of)
the  Agent  role  in  the  generic  transitive  event 
schema  (while  ignoring  other  intermediate 
causal  forces  involved  in  the  event).  The  lover 
—  the  salient  affected  entity  in  the  causal  event 
sequence  —  is  perceived  as  an  instance  of  the 
Patient  role  in  the  transitive  event  schema 
(while  ignoring  other  less  salient  affected  ob¬ 
jects,  such  as  the  poison  and  the  food  manipu¬ 
lated  by  Mary  as  well). 

The  perception  of  the  analogy  between  the 
high-level  structure  of  the  conceived  novel 
event  and  the  structure  of  the  transitive  event 
schema motivates the speaker to use the transitive syntactic
construction [NP V NP] (associated with the transitive schema, Figure 1)
as a
linguistic  integrating  frame  (Fauconnier  & 
Turner,  in  press)  for  communicating  the  event 
that  occurred  in  the  world.  The  perceived  anal¬ 
ogy  between  Mary  and  the  Agent  role  in  the 
transitive  event  schema  leads  to  the  linguistic 
association  (or  binding)  of  the  lexical  item 
‘Mary’  (that  represents  the  person  Mary)  with 
the  subject  NP  slot  in  the  Transitive  syntactic 
construction  (that  represents  the  Agent  role  in 
the  transitive  event  schema).  Likewise,  the  per¬ 
ceived  analogy  between  the  murder  victim  (the 
lover)  and  the  Patient  role  in  the  transitive  event 
schema  leads  to  the  linguistic  association  of  the 
phrase  ‘her  lover’  (representing  the  affected 
entity)  with  the  object  NP  slot  in  the  Transitive 
syntactic  construction  (representing  the  Patient 
role  in  the  transitive  event  schema). 

The  speaker  now  also  has  to  choose  which 
aspect  of  the  conceived  event  to  express  through 
the  verbal  slot  of  the  Transitive  construction. 
In  the  sentence  ‘Mary  poisoned  her  lover’,  the 
lexical  item  ‘poison’  denotes  the  substance 
Mary  used  to  affect  (kill)  her  lover.  Note  that 
this  lexical  item  (‘poison’)  represents  only  one 
aspect  in  the  complex  event,  but  this  aspect  is 
considered central (or salient) enough to be used
as a linguistic representative of the whole event,
and  as  a  trigger  in  the  hearer’s  mind  for  recon¬ 
struction  of  the  whole  causal  event  sequence. 
Figure  2  illustrates  the  analogical  mapping 
operation  between  the  two  conceptual  struc¬ 
tures:  the  structure  of  the  rich  conceived  event 
(which  is  composed  of  a  sequence  of  tempo¬ 
rally  and  causally  related  sub-events)  and  the 
structure  of  the  Transitive  event  schema.  The 
linguistic  binding  of  lexical  items  (and  their 
phonological  form)  with  syntactic  slots  in  the 
transitive  construction  follows  the  perceived 
conceptual  analogy  and  the  mapping  opera¬ 
tion  across  the  two  conceptual  structures.  This 
linguistic  binding  constitutes  the  basic  opera¬ 
tion  for  sentence  formation  (leading  to  the 
string: ‘Mary poisoned her lover’).

Note  that  we  did  not  represent  in  Figure 
2  the  “generic  space”  (Fauconnier  &  Turner, 
1994),  which  reflects  the  common  structure 
and  organization  shared  by  the  two  input 
structures  (the  conceived  causal  event  and  the 
Transitive  construction).  It  is  by  virtue  of  this 
common  abstract  structure  that  analogy  can 
be  perceived  and  mapping  performed  across 
the  two  input  structures.  The  generic  struc¬ 
ture  in  Figure  2  is  the  transitive  event  sche¬ 
ma,  representing  both  the  semantics  of  the 
transitive  syntactic  construction,  and  that  of 
the  high-level  abstracted  structure  of  the  con¬ 
ceived  event. 

To sum up, what are the basic cognitive skills required for the
generation of the sentence ‘Mary poisoned her lover’ as a description
of the actual complex conceived event? The discussion
above  suggests  that  the  following  minimal  skills 
are  required: 

(1)  The  ability  to  abstract  the  representation  of 
the  rich  conceived  event  in  the  world  to  a 
level where it shares structure and organization with a generic
event schema (e.g., the transitive event schema, ‘Agent acts on/affects
a Patient’). This abstraction operation
is  not  explicitly  illustrated  in  Figure  2. 

(2) The ability to perform the structural mapping between the two
representations (an example of such a mapping configuration is given
in Figure 2).

(3) Mastery of the conventional form-meaning associations in the
language, both between syntactic constructions and event schemas
(e.g., the association between the transitive syntactic pattern
[NP V NP] and the transitive event schema, Figure 1), and between
lexical-phonological items and entities or relations conceived in the
world.

(4)  The  ability  to  perform  the  linguistic  bind¬ 
ing  operation  between  lexical  items  and 
syntactic  slots  (in  a  syntactic  construction) 
following  both  the  perceived  conceptual 
analogy  (at  the  semantic  level)  and  the 
grammatical  conventions  of  the  language 
(languages differ in the type of linguistic
binding  they  permit,  or  prefer,  and  how  they 
mark  them,  as  will  be  discussed  in  the  next 
section).  This  last  operation  is  the  basic 
operation  underlying  sentence  (and  proba¬ 
bly  discourse)  generation. 
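
The four skills listed above can be made concrete with a small Python sketch. It is an illustration only, under stated assumptions: the names `Event`, `TRANSITIVE`, `abstract_event`, `generate` and `LEXICON` are hypothetical and do not come from the analysis, the salient participants are coded by hand, and the abstraction step is stipulated rather than computed.

```python
from dataclasses import dataclass

# Skill 3 (assumed encoding): each syntactic slot of the transitive
# construction [NP V NP] is paired with a role of the event schema.
TRANSITIVE = {"form": ["SUBJ", "V", "OBJ"],
              "roles": {"SUBJ": "agent", "V": "relation", "OBJ": "patient"}}


@dataclass
class Event:
    """A rich conceived event: sub-events plus hand-coded salience."""
    sub_events: list
    initiator: str       # entity that starts the causal sequence
    affected: str        # most salient affected entity
    salient_aspect: str  # aspect chosen to stand for the whole event


def abstract_event(event):
    """Skill 1: ignore the details, keep only the high-level structure."""
    return {"agent": event.initiator,
            "relation": event.salient_aspect,
            "patient": event.affected}


def generate(event, construction, lexicon):
    """Skills 2 and 4: map roles to slots, then bind lexical items to slots."""
    abstraction = abstract_event(event)
    words = []
    for slot in construction["form"]:
        role = construction["roles"][slot]  # slot -> schema role
        concept = abstraction[role]         # schema role -> event element
        words.append(lexicon[concept])      # event element -> lexical item
    return " ".join(words)


poisoning = Event(
    sub_events=["Mary decides to kill her lover", "Mary obtains poison",
                "Mary puts the poison in the food", "the lover eats it",
                "the lover dies"],
    initiator="MARY", affected="LOVER", salient_aspect="POISON")

LEXICON = {"MARY": "Mary", "LOVER": "her lover", "POISON": "poisoned"}

sentence = generate(poisoning, TRANSITIVE, LEXICON)
print(sentence)
```

The sub-events play no role in the output, which mirrors the point that the details of the event are ignored at the abstract level; only the choice of initiator, affected entity and salient aspect determines the sentence.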

The first two skills defined above are general analogy-making skills
(as discussed, with some variations, in, for example, Hofstadter et
al., 1995; Holyoak & Thagard, 1994; Indurkhya, 1992; the first skill,
abstraction, parallels, for example, Hofstadter’s notion of
“essence-extraction”, proposed to be the first stage in analogy
making). The third and fourth skills are linguistic skills (the third
skill, and in particular the details of the form-meaning associations
in each language, has been a main topic of study in the Cognitive
Linguistics literature). All four skills require the ability to access
conceptual structures (linguistic and non-linguistic) in memory, and
to map (or bind) these structures onto one another*.

Figure 2.

3. MAPPING CONFIGURATIONS AND
GRAMMAR 

In  the  previous  section  we  discussed  one 
example of analogical mapping in sentence production. An analogy is
first perceived between a conceptual abstraction of a complex
conceived event in the world and the semantic structure of one of the
language’s syntactic constructions. This analogy leads to the linguistic
expression  of  the  event  by  means  of  the  syn¬ 
tactic  construction.  The  sentence  generation 
operation  is  based  on  linguistic  association  of 
lexical  items  and  syntactic  slots  in  the  construc¬ 
tion,  following  the  perceived  conceptual  anal¬ 
ogy  and  mapping. 

The  conceptualization  and  communication 
of  a  complex  conceived  event  as  an  instance  of 
a  simple  event  structure  (e.g.,  the  transitive 
event  schema)  has  clear  cognitive  advantages. 
This process of conceptual integration (Fauconnier & Turner, in
press) facilitates the conceptual manipulation and categorization of the
event,  and  its  storage  in  memory.  It  also  en¬ 
ables  easier  communication  (a  simple,  short 
sentence  can  trigger  the  whole  event  sequence 
in the hearer’s mind). From a linguistic point of view, this process
allows a small set of grammatical forms (syntactic constructions) to
be reused for the expression of an infinite number of novel,
complex events.


* For discussion of mapping and binding operations, see, for example,
Fauconnier’s study on mapping in language and thought (1997), Damasio
(1989) and Shastri et al. (1993) on binding and convergence zones, and
Grush & Mandelblit (1998), Mandelblit & Zachar (1998), and Petitot
(1995) on their interdisciplinary links.


If this process is indeed so useful cognitively, then it would be
only natural for grammatical systems to evolve to formally mark such
analogical mapping operations in order to systematize and facilitate
their communication. Research on grammatical mapping and integration
suggests that this is indeed the case.

Consider, for example, the active-passive grammatical dichotomy found
cross-linguistically. What this dichotomy really tells the hearer
is  which  participant  in  the  conceived  event  has 
been  linguistically  bound  with  (and  expressed 
by)  the  subject  slot  of  the  integrating  syntactic 
construction.  The  active  form  typically  tells  the 
hearer  that  an  agent  (a  source  of  energy)  in  a 
conceived  event  has  been  bound  onto  the  sub¬ 
ject  slot  of  the  syntactic  construction,  and  the 
passive  form  tells  the  hearer  that  a  patient  (an 
affected  entity)  has  been  mapped  onto  the  sub¬ 
ject  slot  of  the  syntactic  construction. 

Figure 3 illustrates the difference in mapping configuration
underlying the active sentence ‘the dog is eating’ and its passive
counterpart ‘the dog is eaten’. The active-passive verbal grammatical
forms (be V-ing vs. be V-en)
define  the  different  mapping  configurations, 
thereby  providing  the  hearer  with  instructions 
on  how  to  link  (map)  the  partial  information 
provided  by  the  lexical  items  in  the  sentence 
(‘dog’,  ‘eat’)  to  the  actual  structure  of  the  com¬ 
municated  event. 
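
The hearer's side of this contrast can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical table `SUBJECT_ROLE` and function `interpret` that are not part of the original analysis; it captures only the point that the verbal form tells the hearer which event role the subject fills.

```python
# Assumed encoding of the two mapping configurations: the verbal form
# indicates which role of the conceived event is bound to the subject slot.
SUBJECT_ROLE = {"be V-ing": "agent", "be V-en": "patient"}


def interpret(subject, verb, verbal_form):
    """Reconstruct a partial event from the subject, verb and verbal form."""
    event = {"predicate": verb, "agent": None, "patient": None}
    event[SUBJECT_ROLE[verbal_form]] = subject  # unspecified roles stay None
    return event


active = interpret("dog", "eat", "be V-ing")   # 'the dog is eating'
passive = interpret("dog", "eat", "be V-en")   # 'the dog is eaten'
print(active)
print(passive)
```

The same two lexical items yield two different event structures, and the only thing that differs between the calls is the grammatical marking.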


Figure 3. The active-passive mapping configurations
(schematic description).

In Mandelblit (1997, ms.), Hebrew verbal morphological constructions
(binyanim) are analyzed in detail. It is suggested that each
morphological  construction  marks  a  particular 
type  of  mapping  configuration  between  a  con¬ 
ceived  event  (typically  a  causal  sequence  of 
events) and a syntactic construction. The morphological construction
marks: (1) which participant in the conceived causal event (e.g., the
causal force or the affected entity) has been mapped onto the subject
slot of the syntactic construction (as in the active-passive contrast
described in Figure 3); (2) which predicate in
the  conceived  event  (the  causing  or  effected 
predicate)  has  been  mapped  onto  the  verbal  slot 
of  the  syntactic  construction.  A  summary  of  the 
mapping  configurations  associated  with  each 
of  the  seven  principal  morphological  binyanim 
in  Hebrew  is  given  in  Figure  4. 

English,  in  contrast  to  Hebrew,  does  not 
possess  a  grammatical  system  as  rich  as  the 
Hebrew  morphological  binyanim  system  to 
mark  the  link  between  the  main  verb  in  a  sen¬ 
tence  and  the  structure  of  the  communicated 
event. For example, the verb ‘ran’ in ‘Mary ran around the block’ and
‘Mary ran the dog around the block’ looks exactly the same, even though
in the first sentence ‘ran’ refers to the activity of the subject
‘Mary’ (the sole energy source of the running action), while in the
second sentence ‘ran’ primarily refers to the activity of the patient
‘the dog’ (while Mary, who made the dog run, was not necessarily
running herself). The verbal form ‘ran’ in both sentences denotes only
a type of motion activity (of Mary or the dog), but not the relative
role this activity plays within the general structure of the
communicated event.

What  types  of  linguistic  mapping  configu¬ 
rations  from  a  conceived  event  onto  a  syntactic 
construction  are  found  in  English  (where  the 
semantics  of  the  syntactic  construction  is  tak¬ 
en  to  be  analogous  to  an  abstracted  structure  of 
the  communicated  event)? 

Fauconnier  &  Turner  (1996)  analyze  the 
mapping  configurations  underlying  the  use  of 
the English Caused-Motion syntactic construction, studied by Goldberg
(1995). The form of the English Caused-Motion construction is
[NP V NP PP] (= SUB V OBJ OBL), and its associated semantic schema, as
Goldberg suggests, is that of a “caused motion” event (‘X causes Y to
move Z’). Examples of this construction include:

(1)  The  audience  laughed  the  poor  actor  off 
the  stage. 

(2)  Monica  trotted  the  horse  into  the  stable. 

(3)  The  commander  let  the  tank  into  the 
compound. 

(4)  Paul  hammered  the  nail  into  the  door. 

In each of the sentences (1-4), a whole causal sequence of events
[[X act] cause [Y move in direction Z]] is mapped (and conceptually
integrated) into the caused-motion syntactic construction [NP V NP PP],
based on a perceived analogy between the abstract structure of the
conceived event and the caused-motion semantic schema associated with
the syntactic form.

Figure 4. The mapping configurations marked by the different Hebrew
verbal morphological binyanim (Mandelblit, 1997).

In  each  sentence,  different  aspects  of  the 
conceived  causal  event  sequence  are  mapped 
onto the verbal slot of the construction. In example (1), the verb
‘laugh’ specifies the agent’s causing action. In (2), the verb ‘trot’
specifies the motion of the affected patient (the horse). In (3), the
verb ‘let’ specifies neither the agent’s causing action nor the
patient’s motion, but rather the causal link (force dynamics) between
the commander’s (unknown) action and the tank’s motion. In (4), the
verb ‘hammer’ specifies the tool used for achieving the caused-motion
event. The last mapping (4) is most similar to the one observed in the
first example discussed in this paper (‘Mary poisoned her lover’),
where the verb ‘poison’ describes the means (substance) that the agent
used to affect (kill) the patient ^. Note that nothing in English
grammar marks for the hearer the mapping configuration underlying each
sentence. It is up to the hearer to reconstruct the analogical links
between the lexical information provided in the sentence and a
probable conceived event in the world ^.

While  it  is  possible  to  find  in  each  language 
a  basic  similar  set  of  conventional  mapping 
configurations  (either  marked  grammatically  or 
not),  languages  seem  to  differ  in  which  map¬ 
ping  configurations  are  ‘favored’  (used  more 
often than others) in everyday speech (for implications of these
differences for translation, see
Mandelblit 1995, 1997).


But whatever the conventions are, speakers are able to come up with
novel, surprising mappings, as exemplified in the following
caused-motion sentence (from Fauconnier & Turner, 1996):

(5)  The  spy  Houdinied  the  drums  out  of  the 
compound. 

The  analogy  in  example  5  between  the  high- 
level  structure  of  the  conceived  event  and  the 
semantics  of  the  caused-motion  schema  (an  anal¬ 
ogy  which  led  the  speaker  to  express  the  con¬ 
ceived  event  through  the  caused-motion  con¬ 
struction)  is  itself  quite  straightforward  (as  in 
examples  1-4).  What  makes  example  5  look  so 
creative  is  the  unconventional  underlying  map¬ 
ping  configuration:  the  binding  of  ‘Houdini’  to 
the  verbal  slot,  and  what  role  Houdini  plays  in 
the conceived event. We will not go now into
the  details  of  the  mapping  (we  leave  it  for  the 
reader),  but  what  examples  such  as  5  show  is 
that  the  choice  of  a  syntactic  construction  for 
expressing  an  event  as  a  result  of  perceiving 
structural  analogy  between  the  event  and  the 
construction’s  semantics  is  just  the  first  creative 
stage  in  sentence  generation.  Then,  many  dif¬ 
ferent  linguistic  mappings  may  be  used  between 
the  two  analogous  structures  -  some  are  en¬ 
trenched,  and  often  marked  grammatically  (as 
in the use of Hebrew binyanim, Figure 4), and
others  are  completely  novel  and  unpredictable 
(thereby  requiring  special  effort  during  the  pro¬ 
cess  of  interpretation).  Current  computational 
models  of  analogy  and  language  processing  can 
model  the  very  entrenched  linguistic  mappings, 
but  do  not  account  yet  for  the  real  creative  ones. 
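
The gap between entrenched and creative mappings can be illustrated with a deliberately naive Python sketch. It assumes a hypothetical lookup table `VERB_SOURCE` and function `reconstruct`; the table simply records what the text says each verb in examples (1)-(4) stands for, and it is not a model of how hearers actually resolve these mappings.

```python
# Which component of the conceived caused-motion event each verb in
# examples (1)-(4) is said to express (a hand-written table, assumed here).
VERB_SOURCE = {
    "laughed": "agent's causing action",  # (1)
    "trotted": "patient's motion",        # (2)
    "let": "causal link",                 # (3)
    "hammered": "tool",                   # (4)
}


def reconstruct(subject, verb, obj, path):
    """Hearer's side: the construction [NP V NP PP] supplies the roles, but
    what the verb stands for must come from stored conventions."""
    return {"causer": subject, "moved": obj, "path": path,
            "verb_expresses": VERB_SOURCE.get(verb, "unknown (novel mapping)")}


r2 = reconstruct("Monica", "trotted", "the horse", "into the stable")
r5 = reconstruct("The spy", "Houdinied", "the drums", "out of the compound")
print(r2["verb_expresses"])
print(r5["verb_expresses"])
```

A table of conventions handles (2) but has nothing to say about (5): the roles supplied by the construction are still recovered, while the contribution of the verb is left open, which is exactly the part that requires special interpretive effort.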


^ The use of verbs such as hammer and poison in English has become so
entrenched that today these verbs are
viewed  as  denoting  a  whole  causal  event  themselves  rather 
than  just  the  tool  or  substance  used  to  achieve  an  effect. 
Note  however  that  when  these  verbs  first  emerged  in  the 
language  (through  a  so-called  “verbal  derivation”  operation) 
they  reflected  a  particular  type  of  mapping  configuration  from 
events  onto  syntactic  forms  that  speakers  preferred  to  use. 
Similar  new  mapping  configurations  are  still  created  every¬ 
day  by  speakers,  and  it  is  the  goal  of  cognitive  linguists  to 
capture  and  describe  this  productive  operation. 


^ Goldberg (1995:65) defines a hierarchy of possible relations
between the semantics designated by a verb (V) and the semantics
designated by the syntactic construction (C) it instantiates. By doing
so, Goldberg in fact defines the various mapping configurations
available in English between what the verb designates in the conceived
event and the analogous semantics of the construction. The hierarchy
Goldberg defines is as follows: 1. V is a subtype of C. 2. V designates
the means of C. 3. V designates the result of C. 4. V designates a
precondition of C. 5. (to a very limited extent) V may designate the
manner of C, means of identifying C, or the intended result of C.

4. A SHORT NOTE ON LANGUAGE
ACQUISITION

The  discussion  in  the  previous  sections  sug¬ 
gests  that  an  essential  cognitive  skill  for  sentence 
generation  is  analogy  making:  that  is,  abstrac¬ 
tion  and  mapping.  An  interesting  question  is  to 
what  extent  young  children  (who  acquire  their 
first  language)  already  possess  these  skills. 

Consider the following example from Berman’s (1982) study on the
acquisition of Hebrew binyanim by children. At the age of two, Israeli
children still fail to use the correct morphological verbal form
(suggested to mark underlying mapping configurations, Figure 4).
Marked improvement is shown only at the age of three to four. The data
from Berman suggest, however, that two-and-a-half-year-old children
already master the underlying analogical (abstraction and mapping)
operations required for sentence generation, as discussed below. The
children only fail to mark the mapping with the correct grammatical
form.

Example (6) is a sentence generated by Berman’s own child
around the age of 2;6 (similar examples in English are reported
in the CHILDES archive):

(6) ima oxelet oti hayom
    mother is-eating me today
    (meaning: ‘mother is feeding me today’)

Sentence  (6)  is  syntactically  correct  (using 
the  simple  transitive  syntactic  construction), 
with  appropriate  word  order  and  case  marking 
of nouns. The only error in (6) is that the child used the wrong
morphological form for the verb, yielding the form oxelet (‘eating’)
rather than ma?axila (‘feeding’). This mistake suggests that the
sentence is not a simple imitation of adults’ speech (the child has
probably never heard the combination ‘eat me’), but rather a real
creative production of the child.

The  event  in  the  world  involves  some  com¬ 
plex  links  between  the  mother  and  the  child  (the 
mother  prepares  food,  then  brings  it  to  the 
child’s mouth, thereby enabling the child to eat).
But at a higher abstract level, the child correctly
perceives this event as analogous to (or an
instance  of)  the  basic  transitive  schema  (‘Agent 
affects  Patient’),  thereby  choosing  the  Transi¬ 
tive  syntactic  construction  to  express  the  event. 

The  mapping  performed  by  the  child  is  also 
correct. The child perceives the mother as the
source  (agent)  of  the  causal  event  and  herself 
as  the  affected  patient,  and  thus  maps  the  moth¬ 
er  to  the  subject  slot  and  herself  to  the  object 
slot  in  the  transitive  syntactic  construction.  Into 
the  verbal  slot  the  child  maps  the  effected  ac¬ 
tivity  of  the  patient  (herself)  -  ‘eating’  (rather 
than,  say,  the  mother’s  action).  This  mapping 
itself is possible in Hebrew (as well as in English, as in ‘I walked
the dog’, where walking refers to the activity of the patient, the dog).
The only error the child made is in the morphological marking of the
chosen mapping (by hif’il morphology, see Figure 4).
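
The separation between a correct mapping and an incorrect marking can be sketched in Python. The two verb forms are the ones cited for example (6); everything else (the labels "basic" and "hif'il" as dictionary keys, the function names, and the simplified marking rule) is an assumption made for this illustration and is not Berman's or Mandelblit's formalization.

```python
# Verb forms cited in the text for example (6); the keys are labels
# chosen for this sketch.
VERB_FORMS = {"basic": "oxelet",      # glossed 'is eating'
              "hif'il": "ma?axila"}   # glossed 'is feeding'


def conventional_marking(verb_expresses, subject_role):
    """Simplified adult convention: when the verb names the patient's
    activity but the subject is the causer, the verb takes hif'il."""
    if verb_expresses == "patient activity" and subject_role == "causer":
        return "hif'il"
    return "basic"


def produce(marking):
    """The slot bindings are identical; only the verb morphology differs."""
    return " ".join(["ima", VERB_FORMS[marking], "oti", "hayom"])


mapping = {"verb_expresses": "patient activity", "subject_role": "causer"}
adult = produce(conventional_marking(**mapping))
child = produce("basic")  # same mapping, but the marking step is skipped
print(adult)
print(child)
```

Both outputs share word order, subject and object; they differ in a single morphological choice, which is the only place where the child's production departs from the adult form.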

To sum up, examples such as (6) suggest that a 2.5-year-old child
already masters the basic cognitive skills (identified in section 2)
necessary for sentence generation (abstraction and mapping). Errors in
production at this age may occur only due to a lack of command of the
grammatical markers for these conceptual operations (as suggested for
the Hebrew morphological binyanim above).

Analogy-making has frequently been studied in the laboratory and on
the basis of “well defined” tasks, built towards the end of analyzing
specific cognitive mechanisms. Such experiments have led to the
proposal of interesting theories and models of analogical reasoning,
for instance the SME model proposed by Gentner (1989) or the approach
of Holyoak and Thagard (1989). Our objective is somewhat different
since  we  wish  to  study  analogy-making  on  the 
basis  of  real-world  cognitive  activities  and,  es¬ 
pecially,  in  an  area  in  which  analogies  can  play 
a  very  important  role:  design  activities. 

In non-routine design activities, designers have to create an
innovative product as well as satisfy certain specifications. Though
certain designers wish to emphasize the creative and artistic part of
their activities (and, for some of them, to keep it in some way
“mysterious”), we believe that their creativity can be, at least
partially, explained by analogical reasoning, in accordance with
certain research works, even if not directly related to design, such
as those of Boden (1990), Hofstadter (1985), or Kolodner (1993).
Therefore, we set up an experimental situation that should induce
non-routine design activities as well as allow us to analyze
analogy-making by designers and, especially, the effect of classical
parameters associated with analogical reasoning (such as intra- vs.
interdomain sources). We first characterize design problem-solving
more precisely and suggest the role of analogy in it. Then, we describe
our experimental situation and present some hypotheses we had as well
as the results we obtained. These results are finally discussed with
regard to certain theoretical approaches to analogical reasoning.

1.  DESIGN  PROBLEM-SOLVING  AND 
ANALOGY-MAKING 

In Cognitive Psychology, design activities are described as a specific
kind of problem-solving, design problems being both ill-defined
and  open-ended.  Design  problems  are  consid¬ 
ered  ill-defined  because  designers  have,  initial¬ 
ly,  only  an  incomplete  and  imprecise  mental 
representation of the design goals or specifications
(Eastman, 1969; Simon, 1973). Design
problems  are  also  considered  to  be  open-ended 
because  there  is  usually  no  single  correct  solu¬ 
tion  for  a  given  problem,  but  instead  a  variety 
of  potential  solutions  (Fustier,  1989).  These 
characteristics lead to design processes involving an iterative
dialectic between problem-framing and problem-solving (Schoen, 1983;
Simon,  1995).  During  problem-framing,  de¬ 
signers  refine  design  goals  and  specifications 
and,  thus,  refine  their  mental  representation  of 
the  problem.  During  problem-solving,  design¬ 
ers  elaborate  solutions  and  evaluate  these  solu¬ 
tions  with  respect  to  various  criteria  and  con¬ 
straints  (Bonnardel,  1992). 

Our general hypothesis is that creativity, which is required for the
design of new objects, is dependent on the mental images that the
designer can evoke, especially during the problem-framing phase. Such
images may be related to objects that are more or less familiar to
the  designer.  More  precisely,  we  believe  that 
these  objects  can  play  the  role  of  “sources”  (or 
“bases”)  for  an  analogical  reasoning  and,  thus, 
allow  the  designer  to  transfer  some  of  the  ob¬ 
jects’  properties  to  elaborate  a  target-solution 
(or  target-elements  of  solution)  for  the  design 
problem  at  hand.  Though  some  observations  of 
analogy-making  have  been  made  during  design 
activities (see, for instance, Détienne, 1991, and
Visser,  1996),  we  need  to  analyze  more  pre¬ 
cisely  the  analogical  reasoning  in  design  activ¬ 
ities,  to  understand  when  designers  develop  this 
type  of  reasoning,  how  they  exploit  it  and  trans¬ 
fer  knowledge  from  one  domain  to  another,  etc. 

Since creativity in design activities seems to depend on the
designers’ mental representation, the study we are going to present
focuses specifically on the evocation part of analogical reasoning,
and not on the mapping and transfer parts.

2.  EXPERIMENTAL  SITUATION 

The experimental situation we set up allowed us to control, to some
extent, the sources the designers can take into account in order
to  identify  relevant  properties  for  the  target  and, 
therefore,  construct  their  own  representation  of 
the  object  to  design. 

We asked 10 volunteer students in Applied
Art (in a technical school in Marseille, France)
to  design  a  new  product.  Though  these  students 
are  not  very  experienced  designers,  they  ac¬ 
quired  knowledge  and  skills  in  design  and  are 
really  involved  in  design  projects  -  which, 
though  less  complex  than  those  experts  deal 
with,  present  the  main  characteristics  of  pro¬ 
fessional  design  projects.  Therefore,  we  will 
refer  to  these  students  in  design  as  “designers”. 

The  design  problem  they  had  to  solve  was 
defined  in  collaboration  with  their  professor  of 
Applied  Art,  in  order  to  have  a  presentation  in 
accordance  with  the  one  used  for  the  design 
problems they usually have to deal with. Therefore, they were
provided with a schedule of conditions consisting, first, of a
scenario describing both the object to design and its use (see
Figure 1) and, secondly, of a reminder of the main requirements to
satisfy.


The  object  to  be  designed  was  intended  to 
be used in a Parisian “cyber-café”. It should be
a  particular  stool  with  a  contemporary  design 
in  order  to  be  attractive  for  young  customers. 
Such  stools  should  allow  the  user  to  have  a  good 
sitting  position,  holding  the  back  upright.  To¬ 
wards  this  end,  the  users  should  put  their  knees 
on  a  support  intended  to  this  function.  In  addi¬ 
tion,  these  stools  should  allow  the  users  to  re¬ 
lax,  by  offering  them  the  possibility  to  rock. 


Figure 1. Brief description of the object to design.

Even  for  people  who  are  not  specialized  in 
design,  reading  this  description  involves  the 
evocation  of  objects  we  already  know.  Simi¬ 
larly,  the  designers  can  evoke  sources  to  better 
understand  the  object  to  be  designed  and,  even¬ 
tually,  transfer  certain  properties  of  the  sourc¬ 
es  to  the  target.  In  order  to  identify  the  sources 
evoked  by  the  designers  who  participated  in  our 
study,  we  asked  them  to  think  aloud  -  a  meth¬ 
od  frequently  used  to  study  design  activities. 

The designers’ verbalizations as well as their graphical activities
were video recorded. Then, the verbalizations were transcribed and
matched with the drawings made by the designers.

The  experiment  was  50  minutes  long  for 
each designer. This duration was realistic for
producing a rough draft of the object to design.
More  precisely,  it  consisted  of  two  phases  of 
25  minutes  each. 

1. During the first 25 minutes, two experimental conditions were set
up (with 5 designers in each condition):

-  a  free  condition,  in  which  the  designers 
could  freely  solve  the  problem  and  spontane¬ 
ously  evoke  sources  (known  objects)  they  could 
refer  to; 

- a guided condition, in which we presented to the designers names of
objects that could play the role of sources. Two of these potential
sources for analogical reasoning were considered intradomain, in the
sense that they belonged to the category of “seats”. Two other
potential sources were considered interdomain, since they referred to
objects very different from seats. In addition, one intradomain object
and one interdomain object had been studied by the designers during
their Applied Art class, whereas the two other objects had never been
studied (see Table 1). The names of the objects were written on
folders, which were delivered to the designers in a random order.

Sources          Intradomain        Interdomain
Studied          “nomadic” stool    logotype
Never studied    rocking-chair      canoe

Table 1. Characteristics of the potential sources
proposed to the designers.

In this first phase of the experiment, we
chose to provide the designers with only names
of objects, and not graphical representations
of specific objects or “instances”.

These names refer to categories of objects
and may lead the designers to infer what general
principle or feature(s) can be extracted from
this class of objects as relevant for the object to
design. For instance, the designers may reflect
on what could be relevant in a canoe or in a
logotype for designing the specific stool
described in the schedule of conditions.

2. During the following 25 minutes, the
designers of the two groups were in a similar
situation: they had both the names and a
graphical representation of each type of potential
source, which we could call an instance of each
category defined by the names (see annex 1).

Designers who belonged to the “guided”
group could directly open the folders they had
been provided with, to find out the specific
graphical representations. During this second
phase, designers who belonged to the “free”
group were provided with both the names and
the graphical representations.

Unlike the sources' names, their graphical
representations make it easier to identify
relevant principles that the designers can
transfer to the object to design. Seeing
instances of objects may allow the designers to
transfer relevant features more directly to the
object to design.

This experimental situation will allow us
to determine the influence of potential sources
according to the moment of their presentation
in the course of the design activity. It will also
allow us to compare the influence of the names
of objects presented alone with that of a
presentation of both names and instances of
sources. However, since we will only analyze the
evocation part of analogical reasoning, our
analysis will be conducted on “potential” sources
for analogical reasoning. Indeed, while some of
them actually lead to a transfer of relevant
features to the target, other evoked sources may
be abandoned more or less quickly by the
designers.

3. HYPOTHESES

•  Hypothesis  1: 

Our first hypothesis is linked to the progress
of the design problem-solving. We expect the
role of sources to be more or less important
according to the current objectives of the
designers. More precisely, the construction by the
designers of a mental representation of the
object to design is likely to take place mostly
at the beginning of the design problem-solving.
Therefore, we expect that the designers,
whatever experimental group they belong to, will
evoke fewer sources as the problem-solving
progresses.

•  Hypothesis  2: 

Our second hypothesis is based on previous
research work conducted in Cognitive Psychology
and, in particular, on the identification in
various domains of a “functional fixation” (see
Weisberg, 1988, or, earlier, Luchins, 1942). For
instance, certain studies of analogical
reasoning, conducted with pupils in school
settings, showed that they tend to systematically
reproduce what their teachers showed them as
examples (Friemel & Richard, 1988). Such a
fixation has also been identified in design
activities as “design fixation”. Thus, Jansson and
Smith (1989, quoted in Purcell & Gero, 1991)
showed that presenting, as examples, graphical
representations of objects that could potentially
fit the requirements of a design problem leads
designers (and, especially, professional designers)
to reproduce numerous features of these objects,
including features irrelevant to the task at hand.

In accordance with these previous results, in our
study, the designers who belong to the guided
group could focus on the potential sources we
suggested to them. In particular, during the first
phase, the proposal of names of objects could limit
the space of objects that designers can evoke as
sources for design problem-solving based on
analogical reasoning. This implies that the
designers who belong to the guided group would
evoke fewer sources than the designers of the free
group. However, we may also observe possible
differences between the presentation of potential
sources through names and through instances.

•  Hypothesis  3: 

Though not derived from previous research
work, our third hypothesis appears to be,
partially, in contradiction with the previous one,
but allows us to consider more precisely the
influence of interdomain sources.

During the first phase, the names of potential
sources we presented to the designers who
belong to the guided group could, through a
“snowball” effect, lead these designers to consider
more sources than the designers of the free
group. Our suggestion of potential sources that
are a priori independent of the object to
design shows the designers that they can
evoke sources that do not belong to the “seat”
category and that such a process can have an
interesting heuristic power.

4.  RESULTS 

The previous hypotheses are all based on
the number of sources evoked by the designers.
Therefore, the results we present are
quantitative, but they are also related to
qualitative features, such as the moment of source
evocation with regard to the design problem-solving
and the nature of the evoked sources (intra- vs.
interdomain). Here we simply present our results;
we comment on them in section 5.

4.1  Influence  of  Problem-Solving  Phases 

The analysis of the evocation of sources by
designers was first conducted with regard to the
two problem-solving phases we constructed. It
showed results in accordance with hypothesis 1:

• The designers evoke many more sources
during the first 25 minutes than later: a total
of 32 evoked sources during the first phase
vs. only 8 during the second phase.

• Moreover, it is important to point out that
this effect appears for the designers of both
groups:

- in the free group, 86% of the sources were
evoked during the first phase (6 sources
vs. only 1);

- in the guided group, 79% of the sources were
evoked during the first phase (26 sources
vs. 7).
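
The percentages above follow directly from the reported counts. As a minimal sketch of that arithmetic (the data layout and the function name `first_phase_share` are illustrative assumptions, not part of the original study), the figures can be re-derived as follows:

```python
# Counts of evoked sources per group and phase, as reported in section 4.1.
# The dictionary layout and all names here are illustrative assumptions.
evoked = {
    "free": {"phase1": 6, "phase2": 1},
    "guided": {"phase1": 26, "phase2": 7},
}


def first_phase_share(counts):
    """Percentage of a group's sources that were evoked in the first phase."""
    total = counts["phase1"] + counts["phase2"]
    return 100 * counts["phase1"] / total


for group, counts in evoked.items():
    print(group, round(first_phase_share(counts)))

# Totals per phase across both groups.
phase1_total = sum(c["phase1"] for c in evoked.values())
phase2_total = sum(c["phase2"] for c in evoked.values())
print(phase1_total, phase2_total)
```

Run as written, this prints 86 for the free group, 79 for the guided group, and the phase totals 32 and 8, which matches the reported figures.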

4.2  Influence  of  Experimental  Conditions 

The analysis of the evocation of sources
with regard to the two experimental conditions
shows a result contrary to hypothesis 2:

• The designers who belong to the guided
group evoke, on average, more sources than
the designers of the free group: respectively,
a total of 33 sources vs. 7, which corresponds
on average to 6.6 sources per designer
vs. 1.4 (p < .05).

• This effect appears in the two phases of the
experiment but is stronger in the first phase:

- during the 1st phase, 26 sources were
evoked in the guided condition vs. 6 in the
free condition;

- during the 2nd phase, 7 sources were
evoked in the guided condition vs. 1 in the
free condition.
4.3 Nature of the Evoked Sources

The analysis of the nature of the sources
evoked by the designers of the two groups shows
results in accordance with hypothesis 3,
about a “snowball effect” of the suggestion of
interdomain sources:

The designers who belong to the guided
group evoke, on average, more interdomain
sources than the designers of the free group:
respectively 3.8 interdomain sources per designer
vs. 0.2 (p < .05). Therefore, it appears that
almost all the sources evoked by the designers of
the free group are intradomain, whereas the
tendency is the opposite for the designers of the
guided group (see Table 2).
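
The per-designer averages can be checked against the counts in Table 2 and the group size of 5 designers per condition. The following is only a sketch of that arithmetic: the names are illustrative assumptions, and it does not reproduce the significance test, since per-designer data are not reported.

```python
# Source counts from Table 2; five designers took part in each condition.
# Variable and function names are illustrative assumptions.
DESIGNERS_PER_GROUP = 5

table2 = {
    "free": {"intradomain": 6, "interdomain": 1},
    "guided": {"intradomain": 14, "interdomain": 19},
}


def mean_per_designer(count, n=DESIGNERS_PER_GROUP):
    """Average number of sources per designer for a given count."""
    return count / n


for group, counts in table2.items():
    total = sum(counts.values())
    print(
        group,
        mean_per_designer(total),                  # all evoked sources
        mean_per_designer(counts["interdomain"]),  # interdomain only
    )
```

For the free group this gives 1.4 sources per designer overall and 0.2 interdomain sources; for the guided group it gives 6.6 and 3.8, in line with the values reported in sections 4.2 and 4.3.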

5.  DISCUSSION 

We comment on our main results with
regard to the hypotheses we formulated for this
experiment as well as with regard to certain
theoretical approaches to analogical reasoning.

5.1  General  Interpretation  of  the  Evocation 
of  Potential  Sources 

The results we obtained show that designers
evoked many more sources during the first
phase than during the second one. Moreover,
we observed that the designers who belonged
to the guided group evoked, during the first
phase, many more sources than the designers of
the free group. Such a difference may be due to
the “snowball” effect of the potential sources
we suggested to the designers. Indeed, we only
proposed 4 names of sources, whereas designers
of the guided group evoked 26 sources
during the first phase of the experiment.
Therefore, it seems that the presentation of names
of objects, which refer to categories of these
objects, really has a facilitating effect on the
evocation process (some interpretations of this
fact will be proposed in section 5.2).

Nature of evoked sources    Free condition    Guided condition
Intradomain                 6                 14
Interdomain                 1                 19

Table 2. Nature of the evoked sources according to the
experimental conditions.

Moreover, again regarding the design
problem-solving phases, we observed that the
presentation, for the free group and during the
second phase, of the names and instances of
potential sources did not have such a facilitating
effect. Indeed, though they were presented with
such sources, these designers evoked only 1 source.
Therefore the facilitating effect of sources' names
appears only at the beginning of design
problem-solving. In accordance with this
interpretation, the guided group, which was provided
with instances during this second phase, did not
evoke numerous sources either, contrary to what
these designers did during the first phase.

To summarize, it seems that the influence
of the potential sources we suggested to the
designers only appears when they are provided with
names of sources and during the first phase of
the design problem-solving. Indeed, at the
beginning, designers are more involved in the
construction of a mental representation of the
object to design (i.e., the problem-framing),
whereas, later, they are involved in more detailed
problem-solving and graphical representation of this
object. However, a third experimental condition
would have been needed to decide which of
the two previous parameters (names and
presentation at the beginning) has the more
important effect: in this last condition, the
designers would have been provided directly at the
beginning with both names and instances of
potential sources. We initially planned to have this
third experimental condition, but it proved
impossible to set up due to the quite limited
number of volunteer students who participated
in our study. Nevertheless, we can comment
more precisely on the influence of the
presentation of names of potential sources.
5.2 Influence of Suggested Sources' Names
on Spontaneous Interdomain Sources

Our second result, about the influence of the
names of potential sources for analogical
reasoning, differs from results previously obtained
in research areas such as analogical reasoning,
problem-solving and design (especially the
results of Friemel & Richard, 1988, and those
of Jansson & Smith, 1989). Indeed, the
presentation of names of potential sources to the
designers who belonged to the guided group did not
limit the search space of sources that could
contribute to solving the design problem through
analogical reasoning. On the contrary, these
designers evoked many more sources than the
designers of the free group. As already stated, this
shows a facilitating effect on the evocation
process, which can be explained by two types of
interpretations:

1. The effect of “design fixation” may be
dependent on the designers' level of expertise:
such an effect might become stronger as the
designers acquire expertise. Experienced designers,
such as the professionals who participated
in the study of Jansson and Smith (1989), could
be more influenced by the suggestion of
objects specifically related to the object they have
to design (i.e. objects that directly belong to the
same category). On the contrary, less experienced
designers, such as the students who participated
in our study, could be more influenced
by objects that are familiar to them, even if these
objects are not a priori directly related to the
object to design. Other results and, especially,
those of the study of Purcell and Gero (1991)
also favor this interpretation.

Such an interpretation seems to fit design
problem-solving particularly well. As we pointed out
in the characterization of design problems (at the
beginning of this text), these problems are
open-ended and, thus, allow the designers to refer to
various sources. Therefore, less experienced
designers or novices have the opportunity to evoke
sources that are familiar to them though not
directly linked to the object to design (the target).

2. The results we obtained can also be
explained by the nature of the sources we
suggested to the designers during the 1st phase.
These sources are presented as names of
objects, in some way related to the object to
design. Such names reflect categories of objects
and may lead the designers to think of general
principles or features that could be transferred
to the object to design. Therefore, the designers
do not focus on specific features of instances.
On the contrary, they can extend their search
space and evoke a diversity of sources,
which will have in common with the object to
design certain deep principles, for example.

Such an interpretation appears compatible
with certain descriptions of analogical
reasoning, proposed on the basis of more traditional
experiments. Like Ripoll (1998), we can assume
the existence of an abstract categorization of
objects in long term memory. More precisely,
for Ripoll (ibid.), two main types of categories
could intervene in analogical reasoning:

- one, called “structure tag”, corresponds
to the identification of an analogical property
category, and is elaborated by the subjects from
the structural characteristics of objects, or what
we called above deep principles (such as the
functioning principle of objects). This structure
tag would underlie both intra- and interdomain
analogical transfers.

- another, called “domain tag”, corresponds
to the identification of a general semantic
category and constitutes a sort of summary of the
surface properties of objects. It underlies
specifically intradomain analogical transfers.

The third result we obtained allows us to
deepen this analysis: most of the sources
spontaneously¹ evoked by the designers who
belonged to the guided group were interdomain
sources, whereas the designers who belonged
to the free group mainly evoked intradomain
sources. Therefore, the facilitating effect on the
evocation process seems mainly due to the
proposal of interdomain potential sources. For
instance, the suggestion of a canoe as a potential
source shows the designers that they can be
inspired by objects which a priori seem very
far from the object to design. Thus, the role
played by the CSTG would become particularly
important: the designers would be less
focused on surface characteristics of the object
they have to design, and they could take into
consideration various areas of objects, to look
for functioning principles common (at least
partially) to the one they wish to develop for
the new object.

¹ By “spontaneously” evoked, we mean evoked in
addition to the potential sources we suggested.

5.3  Compatible  Models  of  Analogical  Reasoning 

Some results of this study suggest two main
factors that can influence the evocation of
sources by designers for analogical reasoning:

- the goal of the problem (i.e., in our study,
the object to design);

- the designers' perception and mental
representation of what can constitute potential
sources for analogical reasoning.

Certain models of analogical reasoning
seem compatible with these suggestions. In
particular, we can think of the approach of
Holyoak and Thagard (1989), which takes into
account the context and the goal to reach during
analogy-making. The importance of the mental
representation of the goal of the problem has also
been pointed out by Wolstencroft (1989, quoted
in Visser, 1989): for this author, analogical
reasoning would be based on a first stage
of “identification” that allows an appreciation
of the usefulness of mapping. The Copycat
model of Mitchell (1989) is also very interesting
since it points out the fact that the target
and the source have to be perceived as playing
the same role at a certain level of abstraction.

CONCLUSION 

The analysis we performed was focused on
the evocation part of analogical reasoning in
design activities. Since such an area of study
seems particularly useful for better understanding
the creativity developed by designers, we
consider that research toward this end
should continue. Concerning our contribution
to this perspective, some complementary
analyses could be performed on the data we
gathered during the experimental situation
previously described, in order to determine how
the designers use the sources they evoke to solve
the problem at hand. In particular, this leads to
the study of the evaluation process, which
contributes both to analogical reasoning and to
creativity (see Kolodner, 1993). For instance, such
a process can be developed to find relevant
sources for analogical reasoning and to
determine which particular features of the selected
source can be transferred to the target.

In the case of design activities, such studies
will help explain how designers can
go from the mental representation of known
objects to that of the object to design, up to
a full and precise graphical representation of
the designed object at the end of design
problem-solving.
What is common to [all these games]? — Don’t say: “There must be something common, or they
would not be called ‘games’” — but look and see whether there is anything common to all. — For
if you look at them you will not see something that is common to all, but similarities, relationships,
and a whole series of them at that. To repeat: don’t think, but look!

— Wittgenstein, Philosophical Investigations (emphasis author’s)

ABSTRACT

We propose here a new approach to legal
thinking that is based on principles of Gestalt
perception. Using a Gestalt interaction view of
perception, which sees perception as the
process of building a conceptual representation of
the given stimulus, we articulate legal thinking
as the process of building a representation for
the given facts of a case. We propose a model
in which top-down and bottom-up processes
interact to build arguments (or
representations) in legal thinking. We discuss some
implications of our approach, especially with
respect to modeling precedential reasoning and
creativity in legal thinking.

1. INTRODUCTION

We would like to begin by first elaborating
on why we use the expression ‘legal thinking’
and what we mean by it. When talking about
what judges, lawyers, law students, and lay
people do when applying legal concepts, the
convention is to use the expression ‘legal
reasoning.’ We have eschewed the use of this
expression, however, since it gives the misleading
impression that we are talking about inherently
rational, indeed logical, thinking. The typical
view of law is that it is coherent, internally
consistent, logical and rational. Whether or not
this is true, our interest lies in exposing some of
the pre-rational aspects of legal thinking,
especially the influence Gestalts have upon the
perception of a legal problem. Hence we have not
used the term ‘legal reasoning’ even though at
many points (for example, in dealing with legal
precedent) we will be talking about
processes that others would call reasoning.

Having taken this broader view, we would
like to note that, on the surface at least, legal
thinking and perception seem to have nothing
in common. Perception involves receiving some
stimulus from the environment, and processing
it in some way to integrate it in the conceptual
system: it usually involves some kind of
identification, representation or description of
the stimulus in terms of concepts. Legal
thinking, on the other hand, involves generating
arguments for a case as to why a certain
conclusion follows or does not follow from the given
facts of the case: it involves a complex network
of rules and statutes, precedents, and several
extra-legal factors such as intents of the
lawmakers, social and political context and so on.
How could these two seemingly different
processes be related? Among cognitive processes,
legal thinking seems as far removed from
perception as one could probably get. How could
a model of perception shed any light on
generating a legal argument for a given case?

We would like to argue in this paper that,
notwithstanding the surface appearances, legal
thinking can indeed be viewed as perception.
Moreover, we would like to show that a
certain model of perception, which we refer
to as the Gestalt interaction model, can be
applied to legal thinking, and in doing so, yields
interesting insights into how precedential
reasoning works in law, and how its creative
aspects can be captured.

This article is organized as follows. In the
next section we provide a sketch of some
key ideas and principles that originated from
the Gestalt movement. In Section 3, we present
some examples of legal thinking that reflect the
same principles. In Section 4, we list some key
features of our proposed architecture to model
legal thinking; and in Section 5 we examine
briefly some implications of our proposed view
with respect to modeling precedential reasoning
and creativity in legal thinking. Finally,
Section 6 contains the main conclusions of this
article and points to future research issues.

2. GESTALT INTERACTION IN
PERCEPTION AND PROBLEM SOLVING

A major finding of the Gestalt school,
which was started during the early part of the
twentieth century by Duncker, Koffka, Kohler,
Luchins, Maier, Wertheimer, and others, was
that concepts are more than aggregates of sense
data: the human mind prefers to see the world in
terms of structured wholes, even when the
structure is lacking in the stimuli. The term Gestalt
was coined to refer to one of these structured
wholes. Over the years, the members of this
school studied extensively the principles
governing Gestalts in perception and problem
solving. For example, they articulated two key
concepts, namely Einstellung and functional fixity,
to explain why some people are unable to solve
certain problems, especially in situations where
there is a simple, albeit hidden, solution.


Einstellung occurs when a problem solver
comes to think of certain types of problems as
capable of solution in only one way. The best
example is Luchins' (1942) water jars problem.
Subjects were presented with three (usually
hypothetical) water jars with varying volumes
but no gradations, and asked to measure out a
precise goal volume of water. For example, if
the volumes of the jars A, B and C are 21, 127,
and 3, respectively, and the goal is 100, then a
solution of the problem is B-A-2C, meaning:
first fill Jar B from a tap, and then from Jar
B fill Jar A once (leaving 106 cups in Jar B)
and fill Jar C twice (leaving 100 cups). Luchins
found that after solving a number of problems
where the B-A-2C solution applies, subjects fail to
see the simpler solution of a problem such as
23, 49, 3, with the goal being 20. For this latter
problem, the more complex B-A-2C solution
still applies, but a simpler A-C solution is also
available. The Einstellung predisposed subjects
to solve the water jars problem in a certain way.
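
The arithmetic of the two water jars problems can be made concrete with a short sketch. It treats a solution as a signed combination of jar volumes (a positive coefficient means filling the jar, a negative one means pouring water off into it), which is a simplifying assumption about the task rather than a model of the subjects' behavior; the function name `jar_solutions` and the coefficient bound are likewise illustrative.

```python
from itertools import product


def jar_solutions(a, b, c, goal, max_coeff=2):
    """List coefficient triples (x, y, z) with x*a + y*b + z*c == goal.

    Each coefficient ranges over -max_coeff..max_coeff. Results are
    ordered from fewest to most jar operations (sum of |coefficients|).
    """
    found = []
    for x, y, z in product(range(-max_coeff, max_coeff + 1), repeat=3):
        if x * a + y * b + z * c == goal:
            steps = abs(x) + abs(y) + abs(z)
            found.append((steps, (x, y, z)))
    return [coeffs for _, coeffs in sorted(found)]


# Training-style problem: only the long B-A-2C pattern works.
print(jar_solutions(21, 127, 3, 100))

# Critical problem: B-A-2C still works, but a two-jar solution exists.
print(jar_solutions(23, 49, 3, 20))
```

For the first problem the only combination found is (-1, 1, -2), that is, B-A-2C. For the second, the search returns (1, 0, -1) ahead of (-1, 1, -2): the two-jar solution uses fewer operations, yet subjects under Einstellung tend to overlook it.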

It is worth noting that Einstellung effects
can be seen in the representation of a problem,
as well as in the ability to search the state space
of the problem. The water jars experiment is
an example of Einstellung in state space search.
Kellogg (1995) gives an example of Einstellung
in representation. A group of New York
mathematics students set their professor the task
of finding the next member of the sequence 32,
38, 44, 48, 56, 60. They even hinted that the
answer was easy and well-known to the
professor. After some complex calculations, the
professor generated a difficult mathematical
solution. ‘No,’ replied the students, the next
member was ‘Meadowlark.’ They explained
that the professor rode the subway every day,
the stops being 32nd St, 38th St, 44th St, 48th
St, 56th St, 60th St, and then Meadowlark.
Einstellung in representation had meant the
professor was unable to see the solution.

Functional fixity is a similar principle to
Einstellung, but refers specifically to the use
of tools or an object to solve a problem.
Studies show that a tool comes to be associated with
a particular function X, and therefore its use
for function Y is often not seen. The
quintessential examples are the classic candle
problem of Duncker (1945) and the two-cord problem
of Maier (1931). In the former, subjects were
given a candle, a box, drawing pins and a
hammer, and asked to fix the candle to a door so
that it could be lit. The solution was to
hammer the box to the door with the drawing pins,
and use it as a stand for the candle. The
problem was much more difficult if the box was
used to store the candles and drawing pins. The
subjects thought of the box’s function as that
of container only, ignoring its value as a stand.
In Maier’s experiment, subjects were asked to
tie together the free ends of two cords
hanging from the ceiling. They were given a
number of tools (for example a hammer) but the
cords were set further apart than the subject
could reach. The solution was to tie the
hammer onto one cord, and set it swinging. In this
way, the subject could hold one cord, and catch
the other one as the newly-created pendulum
swung towards them.

Interestingly, functional fixity operates in a
similar way to Einstellung, in that prior
experience can enhance the fixity. So, for example,
Birch and Rabinowitz (1951) had subjects build
an electrical circuit prior to the cord problem.
The electrical circuit could be completed with
either a switch or a relay. The subjects were then
presented with the cord problem and a prompt.
In choosing a pendulum weight they
overwhelmingly picked the tool (i.e. switch or relay)
that they had previously not used in the circuit.
So 100% of those using the relay in the circuit
used the switch as a weight, and 77% of the
switch-users used the relay as a weight. When
asked why they had chosen their given tool (i.e.
switch or relay) the subjects explained why it
was the only tool available.

These hindrances to problem solving lead
to the notions of productive and reproductive
thinking (Wertheimer, 1945). Productive
thinking involves a recognition of the relations
between elements in the problem space (its
Gestalt) and the restructuring of the elements into a
new Gestalt which provides the problem
solution. Reproductive thinking is, antithetically,
merely the repetition of a learned response.


The  difference  can  be  seen  in  some  early 
work  with  animals.  Thorndike  (1911)  placed 
some  hungry  cats  in  a  box  which  had  a  lever  in 
it.  The  lever  opened  the  door  leading  to  food. 
The  cats  would  thrash  about  in  the  cage  and 
would  occasionally  knock  the  lever,  thereby 
opening  the  door.  Thorndike  showed  that  hav¬ 
ing  done  this  a  number  of  times,  the  cats  would 
gradually  learn  to  hit  the  lever.  This  is  an  example 
of  reproductive  thinking.  By  contrast,  in  the 
ape  studies  of  Köhler  (1927),  chimpanzees  were 
reported  joining  two  sticks  together  to  reach 
food  outside  their  cages,  in  circumstances 
where  they  had  not  been  shown  how  to  do 
this.  This  type  of  productive  thinking  relied  on 
an  insight,  though  it  can  be  improved  through 
hints  even  where  the  subject  may  be  unaware  of 
the  hint.  Maier  reported  in  his  two-cord  prob¬ 
lem  that  subjects  more  often  reached  the  pendu¬ 
lum  solution  when  an  assistant  ‘accidentally’ 
brushed  against  one  of  the  cords  setting  it  in  slight 
motion.  And  this  result  occurred  even  when  the 
subjects  could  not  recall  the  assistant  brushing 
the  cord.  Subconscious  context  effects  of  this 
kind  have  also  been  demonstrated  more  recently 
by  Kokinov  and  Yoveva  (1996). 

Gestalt  psychology  has  enjoyed  a  recent 
renaissance,  with  a  number  of  its  findings  pro¬ 
viding  insight  into  modern  research  questions 
(Keane  1988;  Garnham  and  Oakhill  1994). 
Though  the  current  paradigm  in  cognitive  sci¬ 
ence  focuses  on  information  theory  and  prob¬ 
lem-space  conceptions  of  perception  and  prob¬ 
lem  solving,  some  of  the  models  of  the  Ge¬ 
stalt  school  have  been  re-interpreted  in  light 
of  information  processing  theory.  (See  Brown 
1989;  Dominowski  1981;  Keane  1985,  1989; 
Newell  1980;  Ohlsson  1984a,  1984b,  1985, 
1992;  Weisberg  and  Alba  1981,  1982;  Weisberg 
and  Suls  1973;  and  particularly  the  influential 
account  of  vision  given  by  Marr 
1982.)  Our  model  of  legal  thinking  as  percep¬ 
tion  follows  on  in  this  tradition. 

In  the  information  processing  model  of 
mind  and  perception  (Lachman,  Lachman,  and 
Butterfield,  1979;  Eysenck,  1993),  information 
presented  to  the  organism  is  perceived 
and  then  processed,  eventually  leading  to  a  response. 
In  this  model,  the  starting  point  is  the 
stimulus  from  the  external  environment,  which 
causes  certain  internal  cognitive  or  conceptual 
processes.  This  type  of  processing  is  called 
bottom-up  or  stimulus-driven  processing,  since 
it  starts  with  perception  of  the  most  fundamen¬ 
tal  stimuli  at  the  bottom,  and  then  works  its  way 
up  into  the  more  abstract  conceptual  processing 
system  (Eysenck  and  Keane,  1995;  Neisser, 
1976).  It  is  involved  in  most  perceptual  tasks: 
understanding  the  visual  field,  comprehending 
phonemes,  interpreting  touch  sensation,  and  so 
on.  And  the  main  contribution  of  the  Gestalt 
approach  here  has  been  to  assert  the  role  of  top- 
down  processing  in  perceptual  tasks. 

Indeed,  while  bottom-up  processing  is 
clearly  important,  it  is  not  the  whole  story.  For 
what  you  see  depends  a  great  deal  on  what 
you  want  to  see,  what  else  there  is  to  see,  what 
else  you  have  seen  before,  and  so  on.  For  example, 
in  spoken  word  recognition,  recognition 
response  times  are  lower  when  other  lexical, 
syntactic  or  semantic  information  is  presented 
with  the  word  (Marslen-Wilson  and 
Tyler,  1980).  Thus,  a  subject  would  recognise 
the  word  ‘butter’  more  easily  if  they  have  just 
heard  the  word  ‘bread’  than  if  they  have  heard 
the  words  ‘motor  oil’  (Eysenck  1993;  Tulving, 
Mandler  and  Baumel  1964).  In  Gestalt  terms, 
the  prompt  of  ‘bread’  will  alter  the  Gestalt  we 
have  in  the  associations  between  words,  and 
hence  alter  reaction  times  to  the  next  word. 
This  is  related  to  the  Einstellung  findings  made 
by  Luchins  (1942),  described  above.  In  a  sim¬ 
ilar  vein,  the  ‘phonemic  restoration  effect’  has 
also  been  demonstrated  where  top-down  pro¬ 
cessing  modifies  the  perception  of  a  single 
word  ‘eel’  in  the  following  sentences:  ‘It  was 
found  that  the  eel  was  on  the  axle’  (wheel),  ‘It 
was  found  that  the  eel  was  on  the  shoe’  (heel), 
‘It  was  found  that  the  eel  was  on  the  orange’ 
(peel)  and  so  forth  (Warren  and  Warren,  1970; 
Samuel  1981).  So  we  see  that  the  top-down 
and  bottom-up  components  are  two  wheels 
connected  to  the  same  axle,  and  are  both  necessary 
for  cognition  to  proceed.  Combining 
the  two  approaches  we  get  what  we  refer 
to  here  as  the  Gestalt  Interaction  view. 
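
The interplay of the two kinds of processing in the phonemic restoration example can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than a model from the literature cited above: the small lexicon, the association table, and the counting rule are invented, and the function names are hypothetical. The bottom-up step proposes every word compatible with the degraded signal; the top-down step lets the sentence context choose among them.

```python
from typing import List, Optional

# Hypothetical mini-lexicon and association table, invented for illustration.
LEXICON = ["wheel", "heel", "peel", "meal"]
ASSOCIATIONS = {
    "axle": {"wheel"},
    "shoe": {"heel"},
    "orange": {"peel"},
}

def bottom_up_candidates(signal: str) -> List[str]:
    """Stimulus-driven step: words whose ending matches the audible part."""
    audible = signal.lstrip("*")  # '*' marks the masked initial sound
    return [word for word in LEXICON if word.endswith(audible)]

def perceive(signal: str, context: List[str]) -> Optional[str]:
    """Combine bottom-up candidates with top-down contextual support."""
    candidates = bottom_up_candidates(signal)

    def support(word: str) -> int:
        # Count context words that are associated with this candidate.
        return sum(word in ASSOCIATIONS.get(c, set()) for c in context)

    best = max(candidates, key=support, default=None)
    # Without any contextual support the signal stays ambiguous.
    if best is None or support(best) == 0:
        return None
    return best

print(perceive("*eel", ["axle"]))
print(perceive("*eel", ["orange"]))
```

Neither step suffices alone in this sketch: the signal by itself leaves three candidates open, and the context by itself says nothing about what was heard.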


Though  the  term  ‘Gestalt  interaction’  may 
be  new,  the  ideas  underlying  it  have  been  around 
for  quite  some  time  (Neisser,  1976;  Pinker, 
1985;  Ullman,  1985).  More  recently,  one  of  us 
(Indurkhya  1992)  proposed  a  formal  framework 
in  which  concepts  and  stimuli  can  interact 
to  generate  ‘representations’.  Computationally, 
reasonable  models  of  visual  perception 
and  speech  recognition  have  always  employed 
a  mix  of  top-down  and  bottom-up  controls 
(Erman  et  al.  1980;  Mandal,  Murthy  & 
Sankar,  1996;  Riseman  and  Hanson,  1987).  It 
is  a  similar  model  that  we  propose  to  apply  to 
legal  thinking. 

3.  GESTALT  INTERACTION  IN  LAW 

There  is  a  difficulty  with  the  application 
of  the  Gestalt  interactionist  model  of  perception  to 
law.  That  model  was  developed  to  explain  fea¬ 
tures  of  perception:  vision  processing,  word 
recognition,  and  the  like.  Legal  reasoning  seems 
to  operate  at  a  higher,  more  abstract  level.  So 
we  must  first  identify  what,  in  law,  corresponds 
to  stimuli  and  Gestalts,  and  then  proceed  to  articulate 
what  the  top-down  and  bottom-up 
processes  are. 

Generally  speaking,  legal  reasoning  starts 
from  the  facts  of  a  given  case,  and  proceeds  to 
establish  whether  certain  legal  conclusions  fol¬ 
low  from  the  facts  or  not.  Whereas  the  facts  of 
the  case  are  usually  expressed  in  concrete  terms, 
the  conclusions  involve  high-level  abstract  concepts 
such  as  ‘negligence’,  ‘duty  of  care’,  and 
so  on.  Thus,  for  the  first  stage  in  our  analysis 
we  can  regard  the  facts  as  the  stimuli,  and  legal 
concepts  as  Gestalts  which  structure  the  facts 
in  certain  ways. 

The  implications  of  this  view  of  legal  think¬ 
ing  are  fairly  obvious.  A  judge,  in  deciding  a 
case  before  her,  will  be  presented  with  a  series 
of  stimuli.  These  will  not  be  interpreted  neutrally. 
Instead,  the  existing  Gestalt  of  the  judge 
will  dramatically  influence  her  perception  of 
them.  Further,  as  a  judge  seeks  to  move  from  one 
Gestalt  to  another,  we  should  be  able  to  see  in 
law  Gestalt  effects  such  as  Einstellung  and  functional 
fixity.  Though  there  is,  regrettably,  no 
empirical  data  on  Gestalts  in  law,  we  can  nonetheless 
see  these  effects  in  one  set  of  data  we 
do  have,  the  legal  cases  themselves. 

Perhaps  the  most  persuasive  demonstration 
of  how  Gestalts  arise,  what  a  critical  role 
they  play  in  legal  thinking,  and  how  they  shift 
over  the  course  of  time  is  made  by  Levi  (1948). 
In  one  of  his  fascinating  case  studies,  he  showed 
how  the  Gestalts  ‘things  imminently  dangerous’ 
and  ‘things  inherently  dangerous’  had  a 
remarkable  influence  on  the  legal  issue  of  liability, 
and  how  they  have  evolved  over  the  last 
two  centuries. 

Another  good  example  is  the  Australian  law 
on  whether  Aborigines  had  sovereignty  and  land 
rights.  Until  recently,  the  indigenous  people  of 
Australia  had  few  if  any  proprietary  rights  in 
Australian  land.  When  one  considers  that  the 
Australian  indigenous  people  had  settled  the 
land  some  40,000  years  prior  to  the  English 
invasion,  this  seems  unfair.  It  is  even  more  un¬ 
fair  when  one  realises  that  under  English  law 
the  Aborigines  should  have  been  granted  limited 
sovereignty  over  Australia.  At  the  time  of 
the  settlement  of  Australia,  English  law  drew 
the  distinction  between  lands  that  were  colo¬ 
nised  where  there  was  an  existing  population 
of  people,  and  lands  that  were  settled  where 
there  were  no  people.  Where  the  land  was  col¬ 
onised,  the  indigenous  laws  of  the  people  re¬ 
mained,  but  where  the  land  was  empty  —  or  in 
the  Latin  terra  nullius  —  English  law  landed  at 
the  same  moment  as  the  first  foot  of  the  British 
seafarers.  Under  British  colonial  rule,  Austra¬ 
lia  was  held  to  be  terra  nullius  at  the  time  of 
white  settlement.  This  was  nothing  more  than 
a  patent  fiction,  as  the  evidence  of  its  falsity  — 
the  native  people,  their  settlements,  their  tools, 
their  culture — was  present  everywhere.  None¬ 
theless  the  fiction  remained  and  it  was  held  that 
the  only  property  laws  in  Australia  were  those 
stemming  from  the  introduction  of  white  rule; 
laws  which  were  less  than  generous  in  their 
grant  of  land  to  Aborigines. 

The  original  cases,  decided  during  the 
1800s  in  an  era  of  laissez-faire  capitalism  and 
blatant  racism,  created  the  initial  Gestalt  to 
limit  aboriginal  holdings  of  land,  except  as  a 


consequence  of  the  English  property  law.  Sub¬ 
sequent  cases  merely  adopted  the  principle  that 
Australia  was  ‘empty  land’  even  though  the  fic¬ 
tion  was  always  obvious.  Each  case  therefore 
is  a  good  example  of  the  Einstellung  effect, 
where  the  perception  of  the  appropriate  outcome 
was  set  by  the  previous  cases.  It  is  inconceivable 
that  no  judge  in  these  cases,  whether 
at  trial  or  during  any  of  the  numerous  appeals 
that  they  entailed,  ever  perceived  the 
term  ‘empty  land’  to  be  at  odds  with  their  eventual 
decision  to  uphold  white  rule. 

Like  the  water  jar  experiments  of  Luchins 
(1942),  the  perception  was  influenced  by  Einstellung. 
However,  as  with  the  jars,  an  alternative 
Gestalt  can  supplant  the  original.  This  happened 
in  the  case  of  Mabo  v  Queensland  (No.  2) 
[(1992)  175  CLR  1].  In  Mabo  the  Australian 
High  Court  held  that  previous  decisions  hold¬ 
ing  that  Australia  was  terra  nullius  at  settle¬ 
ment,  and  consequently  that  Aborigines  had  no 
indigenous  property  rights,  were  wrong  at  law. 
This  is  an  interesting  decision  since  the  court 
did  not  decide  to  change  the  law  to  accommodate 
modern  developments,  in  the  way  we  see 
this  done  in  fields  as  diverse  as  homicide  (including 
a  new  defence  for  ‘battered  wives’)  or 
tax  (making  modern-day  tax  evasion  illegal)  or 
discrimination  law  (adding  age  or  sexual  preference 
as  grounds  for  anti-discrimination  suits). 
Instead  the  court  went  back  to  the  basic  terra 
nullius  formulation  at  the  time  of  white  settle¬ 
ment,  and  concluded  that  previous  courts  were 
wrong  according  to  the  law  at  the  time.  Not¬ 
withstanding  prior  cases  to  this  effect,  the  High 
Court  said  that  Australia  could  not  have  been 
an  empty  land  at  settlement,  since  the  Aborigi¬ 
nal  presence  meant  that,  at  the  law  of  the  time, 
it  was  a  colonised  country.  Aboriginal  law  had 
thus  remained  in  force  for  the  200  years  that 
the  white  courts  had  declared  that  it  never  ex¬ 
isted.  This  is  a  remarkable  example  of  an  al¬ 
tered  Gestalt,  though  related  processes  occur 
all  the  time  as  judges  adapt  laws  to  social  needs. 

Another  example  is  one  which  focuses  on  a 
process  that  appears  to  be  similar  to  Maier’s  two- 
cord  problem  and  functional  fixity.  Clearly  law 
does  not  deal  directly  with  physical  tools.  However, 
cases  can  be  seen  as  one  of  the  tools  of 
legal  thinking.  This  differs  somewhat  from  our 
earlier  characterisation  that  the  case  to  be  decid¬ 
ed  is  a  stimulus,  but  there  is  no  inconsistency 
here.  The  Gestalt  psychologists  realised  that  perception 
and  problem  solving  are  intimately  related, 
and  are  both  reliant  on  Gestalts.  In  the  legal 
field,  the  Gestalt  affects  the  perception  of 
the  current  case,  as  mentioned  above.  It  will  also 
affect  the  ability  to  solve  the  ‘problem’  of  the  case, 
using  the  cases  available  to  the  judge.  These  cases 
then  form  the  judge’s  tools,  and  only  some  of  them 
are  going  to  be  useful  to  solve  any  given  legal 
problem.  The  ability  of  the  judge  to  use  these 
tools  should  therefore  display  similar  Gestalt 
characteristics,  including  functional  fixity.  We 
can  demonstrate  this  with  two  examples:  the  first 
from  Anglo-Australian  family  law  and  the  second 
from  English  contract  law. 

When  a  married  couple  divorces,  the  divi¬ 
sion  of  property  is  determined  in  large  part  by 
the  old  case  law  of  ‘Husband  and  Wife’  and  by 
various  Acts.  In  Australia  and  England  at  least, 
these  generally  provide  for  division  according 
to  the  economic  value  added  to  the  marital  assets.  This 
was  plainly  unjust  where  the  husband  had 
worked,  while  the  wife  cared  for  children  and 
maintained  the  household.  In  this  situation,  the 
standard  decision  was,  until  recently,  that  the 
husband  would  get  the  lion’s  share  of  the  property. 
However,  in  an  example  of  productive  thinking, 
one  court  introduced  a  principle  from  a  completely 
different  area  of  law  and  held  that  the 
wife’s  work  placed  into  the  house  meant  she  had 
an  equitable  interest  in  it.  The  husband,  though 
legally  the  owner  of  the  house,  actually  held  part 
of  it  in  ‘constructive  trust’  for  his  wife.  This  decision 
was  soon  followed  by  a  number  of  other 
courts,  and  is  now  the  standard  approach. 

This  is  an  example  of  using  a  tool, 
‘constructive  trusts’,  in  a  way  that  was  never 
intended  by  the  original  creators  of  the  principle. 
Another  is  the  decision  of  Lord  Denning 
in  the  High  Trees  case,  which  modified 
contract  law  by  introducing  another  equitable 
principle,  this  time  one  called  ‘promissory 
estoppel.’  The  details  of  this  need  not  detain 
us,  but  suffice  to  say  that  a  legal  concept  from 


a  different  area  was  drafted  into  service  to  deal 
with  a  problem  in  contract  law.  Both  this,  and 
the  family  law  example,  show  that  a  type  of 
functional  fixity  exists  in  law,  but  that  this  can 
be  broken  down  under  pressure. 

4.  AN  ARCHITECTURE  FOR  LEGAL 
THINKING 

To  model  legal  thinking  as  Gestalt  interaction, 
we  propose  an  architecture  based  on  the  ‘analogy 
as  high-level  perception’  approach  of  Hofstadter 
and  his  colleagues  (1995),  and  containing 
many  ideas  derived  from  computational 
models  of  perception,  especially  speech  recognition 
(Erman  et  al.  1982)  and  machine  vision 
(Riseman  and  Hanson,  1987;  Ullman,  1985). 
The  key  features  of  our  proposed  architecture 
are  as  follows: 

•  A  multi-layer  representation  is  used,  with 
the  bottom  layer  containing  the  concrete 
facts,  and  the  top  layer  containing  the  Gestalts 
and  the  rationale  for  the  decision  (ratio 
decidendi)  in  terms  of  the  Gestalts.  Intervening 
levels  contain  intermediate  concepts 
and  categories  that  mediate  the  transition 
from  facts  to  Gestalts. 

•  The  process  of  legal  thinking  is  seen  as  that 
of  coming  up  with  a  Gestalt  representation 
in  the  top  layer,  given  the  facts  in  the  bot¬ 
tom  layer. 

•  The  process  is  mediated  by  both  top-down 
and  bottom-up  operators.  A  top-down  op¬ 
erator  tries  to  fit  the  more  concrete  data  of 
the  lower  layer  into  the  Gestalt  of  the  up¬ 
per  layer.  A  bottom-up  operator  activates 
a  certain  Gestalt  in  the  upper  layer  when  a 
pattern  is  detected  at  the  lower  layer. 

•  There  may  also  be  intra-level  operators 
that  connect  concepts  (Gestalts  or  facts) 
within  the  same  level.  They  may  work  in 
the  forward  direction  (from  the  conclu¬ 
sions  so  far  reached,  derive  new  conclu¬ 
sions)  or  in  the  backward  direction  (to 
reach  a  desired  conclusion,  posit  the  necessary 
sub-conclusions). 

•  The  operators  embody  statutory  knowledge, 
heuristic  knowledge,  extra-legal  factors, 
and  so  on. 

•  Certain  Gestalts  may  be  preactivated  in  the 
top  layer  to  reflect  the  bias  or  the  predispo¬ 
sition  of  the  cognitive  agent,  or  to  reflect 
the  current  legal  doctrines. 
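
The features listed above can be illustrated with a minimal sketch. It is only one possible reading of the architecture, not an implementation by the authors: the class names, the numeric activation values, the fixed-point loop, and the negligence example are all assumptions made for illustration, and intra-level operators are omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    """One level of the representation: item name -> activation."""
    name: str
    active: dict = field(default_factory=dict)

@dataclass
class BottomUp:
    """Activates an upper-layer item when a pattern is present below."""
    level: int          # index of the lower layer this operator reads
    pattern: frozenset  # items that must all be active in the lower layer
    target: str         # item activated in the layer above

    def apply(self, layers):
        lower, upper = layers[self.level], layers[self.level + 1]
        if self.pattern.issubset(lower.active) and upper.active.get(self.target, 0.0) < 1.0:
            upper.active[self.target] = 1.0
            return True
        return False

@dataclass
class TopDown:
    """Fits a lower-layer item into an active upper-layer Gestalt."""
    level: int
    gestalt: str  # upper-layer item that must be active
    item: str     # lower-layer item open to interpretation
    reading: str  # how that item is read under the Gestalt

    def apply(self, layers):
        lower, upper = layers[self.level], layers[self.level + 1]
        if (self.gestalt in upper.active and self.item in lower.active
                and self.reading not in lower.active):
            lower.active[self.reading] = upper.active[self.gestalt]
            return True
        return False

def interpret(layers, operators, preactivated=()):
    """Apply all operators repeatedly until no operator changes anything."""
    for gestalt in preactivated:  # bias or current doctrine (assumed weight 0.5)
        layers[-1].active.setdefault(gestalt, 0.5)
    changed = True
    while changed:
        changed = False
        for op in operators:
            if op.apply(layers):
                changed = True
    return layers

# Hypothetical case: concrete facts at the bottom, Gestalts at the top.
facts = Layer("facts", {"driver was speeding": 1.0, "pedestrian was injured": 1.0})
concepts = Layer("intermediate concepts")
gestalts = Layer("Gestalts")
operators = [
    BottomUp(0, frozenset({"driver was speeding"}), "breach of duty"),
    BottomUp(0, frozenset({"pedestrian was injured"}), "harm"),
    BottomUp(1, frozenset({"breach of duty", "harm"}), "negligence"),
    TopDown(1, "negligence", "harm", "compensable damage"),
]
interpret([facts, concepts, gestalts], operators)
print(concepts.active, gestalts.active)
```

Passing `preactivated=["negligence"]` with fewer facts shows the effect of predisposition in this sketch: the top-down operator still reads ‘harm’ as ‘compensable damage’, only at the weaker preactivation level.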

5.  SOME  IMPLICATIONS  OF  THE 
PROPOSED  VIEW 

The  model  of  legal  thinking  outlined  in  the 
last  section  has  some  significant  implications, 
especially  when  compared  to  the  existing  ap¬ 
proaches  to  legal  reasoning.  Here  we  will  briefly 
examine  two  such  implications. 

5.1  Precedential  reasoning 

The  traditional  approaches  to  precedential 
reasoning  in  law  invariably  involve  some  kind 
of  matching  of  the  facts  of  the  given  case  with 
the  cases  stored  in  the  case  library  (Ashley, 
1990;  Branting,  1993).  In  these  approaches,  the 
representations  of  the  cases  are  kept  fixed,  so 
they  are  not  able  to  model  the  process  of  rein¬ 
terpretation  of  old  cases  and  Gestalt  shifts  as 
new  cases  are  considered,  as,  for  example,  recounted 
in  Levi  (1948).  In  our  model,  however, 
each  case  is  represented  as  a  multi-layered 
network  connecting  the  concrete  facts  of  the 
case  with  the  Gestalts  that  were  found  applica¬ 
ble  in  its  decision.  And  when  these  networks 
are  activated  in  order  to  build  a  representation 
for  the  given  facts  of  a  new  case,  the  process  is 
far  more  complex  and  subtle  than  matching 
parts  of  the  new  case  against  portions  of  the 
stored  cases.  In  this  process,  the  old  cases  are 
as  likely  to  be  reinterpreted  as  the  new  case, 
and  it  may  result  in  a  slight  or  a  drastic  change 
in  the  Gestalts  at  the  top  level. 

5.2  Creativity  in  legal  thinking 

Though  one  might  expect  creativity  to  be 
anathema  in  legal  thinking,  we  need  not  look 
very  hard  to  find  many  instances  where  a  cer¬ 
tain  degree  of  creativity  was  involved.  In  such 


situations,  the  creativity  often  lies  in  the  Ge¬ 
stalt  switch.  In  modeling  this  phenomenon,  a 
key  question  is:  where  does  the  new  Gestalt 
come  from?  One  possible  answer  to  this  is  that 
it  comes  from  some  other  case.  One  of  us  has 
pursued  this  idea  elsewhere  (Indurkhya,  1997) 
to  show  how  creative  insights  can  result  from 
applying  a  Gestalt  from  one  case  to  reinterpret 
another  case.  In  particular,  it  was  shown  there 
how,  given  two  precedents  P1  and  P2,  and  a 
new  case  N,  if  P1  and  P2  are  individually  applied 
to  N,  a  certain  conclusion  can  be  derived 
for  the  outcome  of  N;  but  if  the  Gestalt  of  P1  is 
used  to  reinterpret  P2,  and  then  the  reinterpreted 
P2  is  applied  to  N,  the  opposite  conclusion  for 
N  can  be  derived. 

6.  CONCLUSIONS 

We  have  argued  here  that  Gestalt  principles 
can  help  us  understand  a  number  of  features 
of  legal  thinking.  Notably,  they  begin  to 
explain  why  law  seems  to  be  a  fairly  static  process 
of  case  and  rule  application.  This  is  due  in 
part  to  the  Einstellung  and  functional  fixity  effects 
inherent  in  the  adoption  of  one  particular 
Gestalt.  They  further  explain,  however,  why  the 
law  goes  through  upheaval  at  certain  times,  as 
one  Gestalt  is  swapped  for  another. 

This  view  differs  from  the  traditional,  ra¬ 
tionalist,  formalist  view  of  legal  reasoning, 
where  legal  concepts  are  represented  as  suffi¬ 
cient  and  necessary  conditions,  the  rigid  appli¬ 
cation  of  which  will  lead  to  perfect  justice.  This 
view  is  one  which  is  rarely  accepted  these  days. 
Even  in  Levi’s  day,  it  was  under  attack:  “It  is 
important  that  the  mechanism  of  legal  reason¬ 
ing  should  not  be  concealed  by  its  pretense.  The 
pretense  is  that  the  law  is  a  system  of  known 
rules  applied  by  a  judge;  the  pretense  has  long 
been  under  attack.”  (Levi,  1948,  p.  1) 

Nonetheless,  the  view  that  legal  reasoning 
or  legal  thinking  is  dependent  on  formal  prin¬ 
ciples  is  one  that  dies  hard.  In  order  to  advance 
our  Gestalt  interactionist  model  of  legal  thinking 
over  the  formalist  view,  we  need  to  excavate 
more  carefully  what  Levi  calls  the  ‘mechanism 
of  legal  reasoning.’  The  ideas  presented 
here  barely  scratch  the  surface.  Just  as  the  Gestalt 
school  formulated  many  principles  to  explain 
why  certain  Gestalts  are  preferred  over  others, 
we  also  need  to  articulate  in  more  detail  why 
Gestalts  in  legal  thinking  shift  the  way  they  do; 
what  necessitates  a  Gestalt  switch;  where 
the  new  Gestalts  come  from;  and  so  on.  This 
would  require  much  empirical  work  —  in  terms 
of  case  studies  and  perhaps  also  experiments 
involving  practising  attorneys  and  judges.  From 
such  studies  we  may  be  able  to  get  a  glimpse  of 
what  kinds  of  top-down  and  bottom-up  processes 
are  active  in  legal  thinking,  how  they  are 
constrained  and  how  they  constrain  legal  Ge¬ 
stalts.  We  seek  to  continue  this  line  of  research 
in  future,  and  hope  that  our  ideas  will  inspire 
others  to  join  in  this  endeavour. 

ABSTRACT 

Depictions,  such  as  maps,  that  portray  vis¬ 
ible  things  are  ancient  whereas  depictions,  such 
as  graphs  and  diagrams,  that  portray  things  that 
are  inherently  not  visible,  are  relatively  modern 
inventions.  They  serve  a  variety  of  functions, 
such  as  providing  models,  attracting  attention, 
supporting  memory,  and  facilitating  inference 
and  discovery.  Depictions  use  space  to 
convey  meaning  in  ways  that  are  cognitively 
natural,  as  suggested  by  historical  and  devel¬ 
opmental  examples.  Typically,  icons  are  used 
to  convey  elements,  based  on  likenesses  and 
“figures  of  depiction,”  and  spatial  relations  are 
used  to  convey  other  relations,  based  on  prox¬ 
imity. 

INTRODUCTION 

Graphics  are  one  of  the  oldest  and  newest 
forms  of  communication.  Long  before  there  was 
written  language,  there  were  pictures,  of  myri¬ 
ad  varieties.  A  few  of  the  multitude  of  cave 
paintings,  petroglyphs,  bone  incisions,  clay 
impressions,  stone  carvings,  and  wood  mark¬ 
ings  that  people  fabricated  and  used  remain 
from  ancient  cultures.  Some  of  these  prealphabetic 
depictions  probably  had  religious  significance, 
but  many  were  undoubtedly  used  to  communicate, 
to  keep  track  of  events  in  time,  to 
note  ownership  and  transactions  of  ownership, 
to  map  places,  to  record  songs  and  sayings,  and 
to  transmit  messages  (e.g.,  Coulmas,  1989;  De 
Francis,  1989;  Gelb,  1963;  Mallery,  1893/1972; 
Schmandt-Besserat,  1992).  As  such,  they 
served  as  permanent  records  of  history,  com¬ 
memorations  of  cultural  past.  Because  pictures 


represent  meaning  more  directly  than  alphabetic 
written  languages,  we  can  guess  at  their  mean¬ 
ings  today.  In  rare  cases,  we  have  the  benefit  of 
contemporaneous  translations.  Mallery,  for 
example,  was  able  to  speak  with  native  Amer¬ 
icans  still  using  pictographic  communication 
as  he  collected  vast  numbers  of  their  petroglyphs 
and  birch  bark  markings  (1893/1972). 

In  many  places,  the  use  of  pictures  to  com¬ 
municate  developed  into  complete  written  lan¬ 
guages.  All  such  languages  invented  ways  to 
represent  concepts  that  are  difficult  to  depict, 
such  as  abstract  meanings  and  proper  names. 
Some  pictoric  languages  transformed  and  be¬ 
gan  using  written  marks  to  represent  the  sound 
of  spoken  language  rather  than  using  marks  to 
represent  meaning  directly.  As  pictures  evolved 
into  written  languages,  their  transparency  disappeared. 
Characters  representing  abstract  concepts 
were  devised  and  characters  representing 
concrete  concepts  became  schematized  and 
conventionalized.  Later,  the  invention  and 
spread  of  the  alphabet,  and  then  the  invention 
of  the  printing  press  decreased  reliance  on  pic¬ 
tures  for  communication.  With  the  increasing 
ease  of  reproducing  written  language  and  the 
spread  of  literacy,  pictures  became  decorative 
rather  than  communicative. 

Now,  pictures,  depictions,  and  visualiza¬ 
tions  are  on  the  rise  again.  As  with  the  prolifer¬ 
ation  of  written  language,  this  is  partly  due  to 
technologies  for  reproducing  and  transmitting 
pictures.  And  as  with  the  proliferation  of  writ¬ 
ten  language,  some  of  the  expansion  of  pictures 
is  due  to  intellectual  insights.  For  this,  the  ba¬ 
sic  insight  is  using  depictions  to  represent  ab¬ 
stract  meaning  by  means  of  visual  and  spatial 
metaphors  and  figures  of  depiction.  Although 
depictions  have  long  been  used  to  convey  concrete 
ideas,  their  use  in  conveying  abstract  ideas 
is  more  recent.  Early  depictions  for  the  most 
part  portrayed  things  that  were  inherently  visualizable, 
such  as  objects  or  environments,  in 
pictographs,  maps,  or  architectural  plans.  Vi¬ 
sualizations  of  things  that  are  not  inherently 
visualizable,  such  as  temporal,  economic,  causal, 
or  social  relations,  are  a  modern  invention. 
These  depictions  depend  on  analogy  rather  than 
miniaturization  or  enlargement. 

Graphs  are  perhaps  the  most  prevalent  example 
of  depictions  of  abstract  concepts.  They  were 
not  invented  until  the  late  eighteenth  century 
(e.g.,  Beniger  and  Robyn,  1978;  Carswell  and 
Wickens,  1988;  Tufte,  1983),  although  they 
probably  had  their  roots  in  mathematical  notation, 
especially  Cartesian  coordinate  systems. 
Two  Europeans,  Playfair  in  England  and  Lam¬ 
bert  in  Switzerland,  are  credited  with  being  the 
first  to  promulgate  their  use,  for  the  most  part 
to  portray  economic  and  political  data. 

Although  those  early  graphs,  X-Y  plots 
with  time  as  one  of  the  variables,  are  still  the 
most  common  type  of  graph  in  scientific  jour¬ 
nals  (Cleveland,  1984),  varieties  of  graphs, 
graphics,  and  visualizations  abound,  with  new 
ones  appearing  all  the  time.  Bar  graphs  and  pie 
charts  are  common  for  representing  quantita¬ 
tive  data,  with  flow  charts,  trees,  and  networks 
widely  used  for  qualitative  data.  Icons  appear 
in  airports,  train  stations,  and  highways  all  over 
the  world,  and  menus  of  icons  on  information 
highways  over  the  world.  Many  are  used  to 
portray  concepts  that  are  difficult  to  visualize. 

The  choices  of  icons  and  graphic  displays 
are  usually  not  accidental  or  arbitrary.  Many 
have  been  invented  and  reinvented  by  adults 
and  children  across  cultures  and  time.  Many 
have  analogs  in  language  and  in  gesture  and 
parallels  in  Gestalt  principles  of  perceptual  or¬ 
ganization.  They  seem  rooted  in  natural  cogni¬ 
tive  correspondences,  “figures  of  depictions,” 
and  spatial  metaphors. 

In  this  paper,  I  present  an  analysis  of  graph¬ 
ic  displays  based  on  their  functions  and  on  their 
structure.  The  evidence  I  will  bring  to  bear  is 
eclectic  and  unconventional,  drawing  from  ex¬ 


aminations  of  historical  graphic  inventions, 
children’s  graphic  inventions,  and  language. 

Other  Approaches.  Others  have  taken  a 
broad  view  of  graphics  from  other  perspectives. 
Bertin  (1981)  put  forth  a  comprehensive  semi¬ 
otic  analysis  of  the  functions  of  graphics  and 
the  processes  used  to  interpret  them  that  estab¬ 
lished  the  field  and  defined  the  issues.  Accord¬ 
ing  to  Bertin,  the  functions  of  graphs  are  to 
record,  communicate,  and  process  information, 
and  the  goal  of  a  good  graphic  is  simplification 
to  those  ends.  Ittelson  (1996)  has  pointed  to 
differences  in  processing  of  “markings,”  delib¬ 
erate,  two-dimensional  inscriptions  on  surfac¬ 
es  of  objects  and  other  visual  stimuli.  Winn 
(1987)  has  discussed  how  information  is  con¬ 
veyed  in  charts,  diagrams,  and  graphs.  Larkin 
and  Simon  (1987)  have  examined  the  differ¬ 
ences  between  sentential  and  diagrammatic 
external  representations,  pointing  to  the  advantages 
of  diagrammatic  ones  for  tasks  where  spatial 
proximity  conveys  useful  information.  Stenning 
and  Oberlander  (1995)  have  analyzed  the 
advantages  and  disadvantages  of  diagrammat¬ 
ic  and  sentential  representations  in  drawing  in¬ 
ferences.  They  argue  that  diagrams  allow  ex¬ 
pression  of  some  abstractions,  much  like  natu¬ 
ral  language,  but  are  not  as  expressive  as  sen¬ 
tential  logics.  Cleveland  (1984;  1985)  has  ex¬ 
amined  the  psychophysical  advantages  and  dis¬ 
advantages  of  using  different  graphic  elements, 
position,  angle,  length,  slope,  and  more,  for  ef¬ 
ficiency  in  extracting  different  kinds  of  infor¬ 
mation  from  displays  of  quantitative  data.  He 
and  his  collaborators  have  produced  convinc¬ 
ing  cases  where  conventional  data  displays  can 
be  easily  misconstrued  by  human  users.  Tufte 
(1983,  1990,  1997)  has  exhorted  graphic  de¬ 
signers  to  refrain  from  “chart  junk,”  extrane¬ 
ous  marks  that  convey  no  additional  informa¬ 
tion,  adopting  by  contrast  a  minimalist  view. 
Wainer  (1984,  1992)  has  gathered  a  set  of  useful 
prescriptions  and  insightful  examples  for 
graph  construction,  drawing  on  work  in  semiotics, 
design,  and  information  processing.  Kosslyn 
(1989;  1994),  using  principles  adopted 
from  visual  information  processing  and  Goodman’s 
(1978)  analysis  of  symbol  systems,  has 
developed  a  set  of  prescriptives  for  graphic  design, 
based  on  an  analysis  of  the  syntax,  semantics, 
and  pragmatics  underlying  graphs.  Pinker 
(1990)  provides  an  analysis  of  information  ex¬ 
traction  from  graphics  that  separates  processes 
involved  in  constructing  a  visual  description  of 
the  physical  aspects  of  the  graph  from  those 
involved  in  constructing  a  graph  schema  of  the 
mapping  of  the  physical  aspects  to  mathemati¬ 
cal  scales.  Carswell  and  Wickens  (Carswell, 
1992;  Carswell  &  Wickens,  1988;  1990)  have 
demonstrated  effects  of  perceptual  analysis  of 
integrality  on  graph  comprehension,  and  oth¬ 
ers  have  shown  biases  in  interpretation  or  mem¬ 
ory  dependent  on  graphic  displays  (Gattis  & 
Holyoak,  1996;  Levy,  Zacks,  Tversky,  & 
Schiano,  1996;  Schiano  &  Tversky,  1992;  Shah 
&  Carpenter,  1995;  Spence  &  Lewandowsky, 
1991;  Tversky  &  Schiano,  1989). 

SOME FUNCTIONS OF GRAPHIC
DISPLAYS 

Despite their variability of form and context,
a number of cognitive principles underlie
graphic  displays.  These  are  evident  in  the  many 
functions  they  serve  as  well  as  in  the  way  infor¬ 
mation  is  conveyed  in  them.  Some  of  their  many 
overlapping  and  sometimes  conflicting  functions 
are  sketched  below.  As  with  functions,  goals,  and 
constraints  on  other  aspects  of  human  behavior, 
so  the  functions  of  graphic  displays  are  some¬ 
times  at  odds  with  each  other. 

Attract attention and interest. One prevalent
function of graphic displays is to attract
attention  and  interest.  As  such,  graphics  may 
be  pleasing  or  shocking  or  repulsive  or  calm¬ 
ing  or  funny. 

Models  of  actual  and  theoretical  worlds. 
Maps,  architectural  drawings,  molecules,  cir¬ 
cuit  diagrams,  organizational  charts,  flow  dia¬ 
grams  are  just  some  of  the  myriad  examples  of 
diagrams  serving  as  models  of  worlds  and  the 
things  in  them.  Note  that  these  are  models,  and 
not  strictly  shrunken  or  expanded  worlds.  Ef¬ 
fective  diagrams  omit  features  that  are  in  the 
modeled  world,  distort  others,  and  add  features 
that are not in the modeled world. Maps, for
example, may exaggerate the sizes of streets so
that they can be seen. They introduce symbolic
elements, for railroads, ocean depth, towns, and
more, that require a key and/or convention to
interpret. The essence of creating an effective
external representation is to abstract those features
that are essential and to eliminate those
that are not.

Record  information.  An  ancient  function 
of  graphics  is  preserving  records.  Tallies,  for 
example,  were  devised  to  keep  track  of  proper¬ 
ty,  beginning  with  a  simple  one  mark  for  one 
item  relation,  developing  into  numerals  as  tal¬ 
lies  became  cumbersome  for  large  sums  and 
calculations  (Schmandt-Besserat,  1992). 

Facilitate memory. Facilitating memory
surely was and is one of the functions of
writing, whether pictographic or alphabetic. A
contemporary example is the use of computer
menus, which turn a recall task into a recognition
one. Graphical user interfaces promote
memory in another way, by using spatial location
cues, an ancient device, the Method of
Loci, with modern support (e.g., Bower, 1970;
Franklin, Tversky, and Coon, 1992; Small,
1997; Taylor and Tversky, 1997; Yates, 1969).

Communication.  In  addition  to  facilitat¬ 
ing  memory,  graphic  displays  also  facilitate 
communication.  As  for  memory,  this  has  also 
been  an  important  function  of  writing,  to  allow 
communication  out  of  earshot  (or  eyeshot). 
Graphic  displays  allow  private,  mental  concep¬ 
tualizations  to  be  made  public,  where  they  can 
be  shared,  examined,  and  revised. 

Effective  graphics  make  it  easy  for  users  to 
extract  information  and  draw  inferences  from 
them.  Maps,  for  example,  facilitate  determining 
routes  and  estimating  distances.  A  map  of  chol¬ 
era  cases  in  London  during  an  epidemic  made  it 
easier  to  find  the  contaminated  water  pump 
(Wainer,  1992).  Plotting  change  rather  than  ab¬ 
solute  levels  of  a  measure  can  lead  to  very  dif¬ 
ferent  inferences  (Cleveland,  1985).  Indeed,  the 
advice in How to Lie with Statistics (Huff, 1954)
has been used for good or bad over and over.
Physics diagrams (Narayanan, Suwa, & Motoda,
1994) and architectural sketches (Suwa &
Tversky, 1996) bias users towards some kinds
of inferences more readily than others.

Graphic  displays  accomplish  all  these  func¬ 
tions  and  more  in  two  separable  ways,  through 
the  use  of  graphic  elements  or  icons,  and 
through  the  spatial  array  of  elements.  Different 
cognitive  principles  underlie  each.  In  general, 
graphic  elements  are  used  to  represent  elements 
in  the  world  and  graphic  space  is  used  to  repre¬ 
sent  the  relations  between  elements,  though 
there  are  exceptions  to  this  generalization.  This 
dichotomy  into  elements  and  relations  maps 
loosely  onto  the  “what”  vs.  “where”  distinction 
in  vision  and  in  spatial  cognition. 

The  fact  that  graphic  displays  are  external 
representation  devices  augments  many  of  their 
functions.  Spatially  organized  information  can 
be  accessed  and  integrated  quickly  and  easily, 
especially  when  the  spatial  organization  reflects 
conceptual  organization.  Several  people  can 
simultaneously  inspect  the  same  graphic  dis¬ 
play,  and  refer  to  it  by  pointing  and  other  de¬ 
vices  in  ways  apparent  to  all,  facilitating  group 
communication. 

ICONS:  FIGURES  OF  DEPICTION 

Sometimes  icons  can  be  used  to  represent 
meaning directly, for example, highway signs
portraying  a  picnic  table  or  falling  rocks  to  in¬ 
dicate  the  presence  of  actual  ones.  “Figures  of 
depiction,”  analogous  to  figures  of  speech,  can 
be  used  to  portray  concepts  that  are  not  readily 
depicted  (Tversky,  1995).  One  common  type 
of  figure  of  depiction  is  metonymy,  where  an 
associated  object  represents  the  concept.  Re¬ 
turning  to  computer  interfaces,  a  picture  of  a 
folder  can  represent  a  file  of  words  and  a  pic¬ 
ture  of  a  trash  can  represent  a  place  for  unwant¬ 
ed  folders.  Analogous  examples  in  language 
include  using  “the  crown”  to  represent  the  king 
and  “the  White  House”  to  represent  the  presi¬ 
dent.  Synecdoche,  where  a  part  is  used  to  rep¬ 
resent  a  whole,  or  a  whole  for  a  part,  is  another 
common  figure  of  depiction.  In  highway  signs, 
an  icon  of  a  place  setting  near  a  freeway  exit 
indicates  a  nearby  restaurant  and  an  icon  of  a 
gas pump a nearby gas station. Analogous examples
in language include “give a hand” for
help and “head count” for number of people.
These same figures of depiction are frequent in
icons in early pictographic writing (Coulmas,
1989; Gelb, 1963; Tversky, 1995). For example,
early Sumerian writing used a foot to indicate
“to go” and an ox’s head to indicate an ox.
Children’s  spontaneous  writing  and  depictions 
also  illustrate  these  principles  (e.  g.,  Hughes, 
1986;  Levin  and  Tolchinsky-Landsman,  1989). 
Like  the  inventors  of  pictographic  languages, 
children  find  it  easier  to  depict  objects,  espe¬ 
cially  concrete  ones,  than  operations.  For  ab¬ 
stract  objects  and  operations,  children  use  me¬ 
tonymy  and  synecdoche.  For  example,  children 
draw  hands  or  legs  to  indicate  addition  or  sub¬ 
traction.  Interestingly,  the  latter  was  also  used 
in  hieroglyphics. 

The  meanings  of  these  depictions  are  some¬ 
what  transparent.  Often,  they  can  be  guessed, 
sometimes  with  help  of  context,  and  even  when 
guessing is not likely, they are easily associated
with their meanings, and thus easily remembered
(for similar arguments in the context of
ASL and gesture, see Macken, Perry and Haas,
1993). Depictions have other advantages over
words.  Meaning  is  extracted  from  pictures  faster 
than  from  words  (Smith  and  McGee,  1980). 
Icons  can  be  “read”  by  people  who  do  not  read 
the  local  language. 

A new use of depictions has appeared in
email: emoticons. Seemingly inspired by smiley
faces, and probably because email is inherently
more  casual  than  other  written  communication, 
computer  vernacular  has  added  signs  for  the 
emotional  expression  normally  conveyed  in 
face-to-face  communication  by  intonation  and 
gesture.  These  signs  combine  symbols  found 
on  keyboards  to  denote  facial  expressions,  usu¬ 
ally  turned  90  degrees,  such  as  :)  or ;). 

GRAPHIC  ARRAYS:  SPATIAL 
METAPHORS 

Graphs,  charts,  and  diagrams  convey  quali¬ 
tative  and  quantitative  information  using  natu¬ 
ral  correspondences  and  spatial  metaphors.  The 
most basic of the metaphors is proximity: proximity
in space is used to indicate proximity on
some other property, such as time or value. Spatial
arrays convey conceptual information metaphorically
at different levels of precision, corresponding
to the four traditional scale types, nominal,
ordinal, interval, and ratio (Stevens, 1946).
These  are  ordered  inclusively  by  the  degree  of 
information  preserved  in  the  mapping.  Sponta¬ 
neously  produced  graphic  displays  reflect  these 
scale  types.  Children,  for  example,  represent 
nominal  relations  in  graphic  displays  at  an  earli¬ 
er  age  than  ordinal  relations,  and  ordinal  rela¬ 
tions  at  an  earlier  age  than  interval  relations 
(Tversky,  Kugelmass,  and  Winter,  1991). 
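
The inclusive ordering of the four scale types can be made concrete with a small sketch. The Python fragment below is only an illustration and is not part of the original analysis: the names Scale, NEW_COMPARISONS, and meaningful_comparisons are hypothetical, and the comparison labels follow the standard reading of Stevens (1946), in which each scale type preserves everything the weaker types preserve and adds one further kind of comparison.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four traditional scale types, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Hypothetical labels for the comparison that first becomes meaningful at each level.
NEW_COMPARISONS = {
    Scale.NOMINAL: "same_or_different",
    Scale.ORDINAL: "greater_or_less",
    Scale.INTERVAL: "size_of_difference",
    Scale.RATIO: "size_of_ratio",
}


def meaningful_comparisons(scale):
    """Return every comparison preserved by a mapping at the given scale level.

    The ordering is inclusive: a level keeps all comparisons of the levels below it.
    """
    return {NEW_COMPARISONS[level] for level in Scale if level <= scale}


if __name__ == "__main__":
    for level in Scale:
        print(level.name, sorted(meaningful_comparisons(level)))
```

Under these assumptions, a display that preserves interval information also supports the grouping and ordering judgments of nominal and ordinal displays, but not judgments of ratio.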

Nominal  scales  are  essentially  clusters  of 
elements  sharing  a  single  property  or  set  of  prop¬ 
erties.  Graphic  devices  indicating  nominal  rela¬ 
tions  often  use  the  simplest  form  of  proximity, 
grouping  (cf.  Gestalt  Principle  of  Grouping). 
Things  that  are  related  are  placed  contiguously 
or  in  close  proximity,  spatially  separated  from 
unrelated  things.  One  use  of  this  device  that  we 
take  for  granted  is  the  spaces  between  words  in 
writing.  In  early  writing,  there  were  no  spaces 
between  words.  Another  example  of  using  sepa¬ 
ration  in  space  to  indicate  separation  of  ideas  is 
indentation  and/or  spacing  for  paragraphs. 

A  list  provides  another  spatial  device  for 
delineating  a  category,  where  all  the  items  that 
need  to  be  purchased  or  tasks  that  need  to  be 
done are written in a single column. Items are
separated  by  empty  space,  and  the  items  begin 
at  the  same  point  in  each  row,  indicating  equiv¬ 
alence.  For  lists,  there  is  often  only  a  single 
category;  organization  into  a  column  indicates 
that  the  items  are  not  randomly  selected,  but 
rather,  share  a  property.  Multiple  lists  are  also 
common,  for  example,  the  list  of  chores  of  each 
housemate.  A  table  is  an  elaboration  of  a  list, 
using  the  same  spatial  device  to  organize  both 
rows  and  columns  (Stenning  and  Oberlander, 
1995).  Examples  include  a  list  of  countries  with 
their  GNP’s  for  each  of  the  last  ten  years,  or  a 
list  of  schools,  with  their  average  achievement 
scores on a variety of tests. Tables cross-classify.
Items within each column and within each
row are related, but on different features. For
example, columns may correspond to countries
and GNP’s by year, or to schools and scores by
test, and rows may provide the values for each
country or school. Train schedules are yet another
example, where the first column is typically
the stops and subsequent columns are the
times  for  each  train.  For  train  schedules,  a  blank 
space  where  there  would  ordinarily  be  a  time 
indicates  a  non-event,  that  is,  this  train  doesn’t 
stop  at  that  station.  Using  spatially-arrayed  rows 
and  columns,  tables  group  and  juxtapose  si¬ 
multaneously. 

Special  signs,  usually  visual  ones  rather 
than  strictly  spatial  ones,  are  sometimes  used 
to  indicate  grouping.  These  seem  to  fall  into 
two  classes,  those  based  on  linking  or  enclo¬ 
sure  (cf.  Gestalt  Principle  of  Grouping)  and 
those  based  on  similarity  (cf.  Gestalt  Principle 
of  Similarity).  Many  signs  used  for  grouping 
resemble  physical  structures  that  enclose  things, 
such  as  bowls  and  fences,  or  physical  structures 
that  link  things,  such  as  paths.  Some  analogous 
structures on paper are lines, parentheses, circles,
boxes, and frames. Like paths or outstretched
arms, lines link one concept to another,
bringing noncontiguous things into contiguity,
making distal items proximal. In tables,
lines, sometimes whole ( ____ ), sometimes
partial ( .... ) (one might interpret broken lines
as more tentative than solid ones), are used to
link related items. Tables often add boxes to
emphasize the structures of rows and columns
or to enclose related items and separate different
ones. Newspapers use boxes to distinguish
one  classified  ad  from  another.  Parentheses  and 
brackets  in  writing  are  in  essence  degenerate 
circles.  The  curved  or  bent  lines,  segments  of 
circles  or  rectangles,  face  each  other  to  enclose 
the  related  words  and  to  separate  them  from 
the  rest  of  the  sentence. 

Circles  indicating  items  belonging  to  the 
same  set  are  useful  in  visualizing  syllogisms 
and  in  promoting  inference  as  in  Euler  or  Venn 
diagrams  or  in  contemporary  extensions  of  them 
(e.g., Shin, 1991; Stenning and Oberlander,
1995). Circles with no physical contact indicate
sets with no common items, and physically
overlapping circles indicate sets with at least
some common items. To increase the inferential
power of Euler diagrams, spatial signs based
on  similarity  have  been  added,  such  as  filling 
in  similar  regions  with  similar  and  dissimilar 
regions  with  different  marks,  color,  shading, 
cross-hatching,  and  other  patterns  (e.  g.,  Shin, 
1991).  Maps  use  colors  as  well  as  lines  to  indi¬ 
cate  political  boundaries  and  geographic  fea¬ 
tures.  For  geographic  features,  many  of  the  cor¬ 
respondences  are  natural  ones.  For  example, 
deserts  are  colored  beige  whereas  forests  are 
colored  green,  and  lakes  and  seas  are  colored 
blue,  with  darker  (deeper)  blues  indicating 
deeper  water. 

Ordinal relations can vary from a partial
order,  where  one  or  more  elements  have  prece¬ 
dence  over  others,  to  a  complete  order,  where 
all  elements  are  ordered  with  respect  to  some 
property  or  properties.  There  are  two  separable 
issues  in  mapping  order  onto  space.  One  is  the 
devices  used  to  indicate  order,  and  the  other  is 
the  direction  of  order.  They  will  be  discussed 
in  order.  Writing  is  ordered,  so  one  of  the  sim¬ 
plest  spatial  devices  to  indicate  rank  on  some 
property  is  to  write  items  according  to  the  or¬ 
der  on  the  property,  for  example,  writing  coun¬ 
tries  in  order  of  GNP,  or  people  in  order  of  age. 
Degrees of empty space can also be used to convey
order, as in progressive indentation in outlines.

Lines  can  be  used  to  indicate  order  as  well 
as  equivalence.  Lines  form  the  skeletons  of 
trees  and  graphs,  both  of  which  are  commonly 
used to display ordered concepts, to indicate
asymmetry on a variety of relations,
including kind of, part of, subservient to, and
derived from. Examples include hierarchical
displays, as in linguistic trees, evolutionary
trees, and organizational charts. Other visual
and spatial devices used to display
order rest on the metaphor of salience. More
salient elements have more of the relevant
property, be it size, color, highlighting, or
superposition. Some of these devices rely on
what can be called natural cognitive
correspondences. For example, high temperatures
are associated with “warm” colors and
low temperatures with “cold” colors, as used
in weather maps and scientific charts. This
association most likely derives from the colors
of things varying in temperature, such as
fire and ice.

Arrows  are  a  special  kind  of  line,  with 
one  end  marked,  inducing  an  asymmetry.  Al¬ 
though  they  have  many  uses,  a  primary  one 
is  to  indicate  direction,  an  asymmetric 
relation. Arrows seem to be based on either
or both of two spatial analogs. One obvious
analog is the projectile, invented by many different
cultures for hunting. It is not the hunting
or piercing aspects of physical arrows that
have been adopted in diagrams, but rather the
directionality. Hunting arrows are
asymmetric, as a consequence of which they
fly more easily in one direction than the
other. Another analog is the idea of convergence
captured by the > (“V”) of a diagram
arrow. Like a funnel or river straits, it
directs anything captured by the wide part to
the point, and straight outwards from there.
Arrows are frequently used to signal direction
in space. In diagrams, arrows are also commonly
used to indicate direction in time. In
production charts and computer flow diagrams,
for example, arrows are used to denote
the sequence of processes. Terms for
time, such as “before” and “after,” and
indeed thinking about time, frequently derive
from terms for and thinking about space (e.g.,
Clark, 1973).

Interval and ratio relations apply more
constraints of the spatial proximity metaphor
than ordinal relations. In graphic displays of
interval information, the spaces
between elements are meaningful; that is, greater
space corresponds to more on the relevant
dimension. This is not the case for ordinal mappings.
In displays of ratio information, the ratios
of the spaces are meaningful.
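
One way to see the difference among ordinal, interval, and ratio mappings is to compute where elements would be placed along a single axis. The sketch below is a hypothetical illustration (the function names and the example values are assumptions, not taken from the studies cited): the ordinal mapping keeps only rank, the interval mapping keeps differences relative to an arbitrary origin, and the ratio mapping keeps distance from a true zero.

```python
def ordinal_positions(values, spacing=1.0):
    """Place values by rank only: equal steps, so spacing carries no meaning."""
    rank = {v: i for i, v in enumerate(sorted(set(values)))}
    return [rank[v] * spacing for v in values]


def interval_positions(values, units_per_value=1.0):
    """Place values so that gaps between positions are proportional to differences.

    The origin is arbitrary (here, the smallest value), so only gaps are meaningful.
    """
    lowest = min(values)
    return [(v - lowest) * units_per_value for v in values]


def ratio_positions(values, units_per_value=1.0):
    """Place values at a distance from a true zero, so ratios of distances are meaningful."""
    return [v * units_per_value for v in values]


if __name__ == "__main__":
    data = [10, 20, 80]  # hypothetical quantities
    print(ordinal_positions(data))
    print(interval_positions(data))
    print(ratio_positions(data))
```

For the hypothetical values 10, 20, and 80, the ordinal placement gives equal steps; the interval placement makes the gap between the second and third elements six times the gap between the first and second; and only the ratio placement lets a reader see that the third element is four times the second.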

The  most  common  graphic  displays  of  in¬ 
terval  and  ratio  information  are  X-Y  plots, 
where  distance  in  the  display  corresponds  to 
distance  on  the  relevant  property  or 
properties.  Bar  charts  are  useful  for  displaying 
quantities  for  several  variables  at  once;  here, 
the height or length of the bar corresponds to
the quantity on the relevant variable. Isotypes
combine  icons  and  bar  charts  to  render  quanti¬ 
ties  on  different  variables  more  readily  inter¬ 
pretable  (Neurath,  1936).  For  example,  in  or¬ 
der  to  display  the  yearly  productivity  by  sector 
for  a  number  of  countries,  a  unit  of  output  for 
each  sector  is  represented  by  an  isotype,  or 
icon  that  is  readily  interpretable,  a  shaft  of 
wheat  for  grain,  an  ingot  for  steel,  an  oil  well 
for  petroleum.  The  number  of  icons  per  sector 
is  proportional  to  output  in  that  sector. 
Icons  facilitate  comparison  across  countries  or 
years  for  the  same  sector.  Isotypes  were  invent¬ 
ed  by  Otto  and  Marie  Neurath  in  the  30’s  as  part 
of  a  larger  movement  to  increase  communica¬ 
tion  across  languages  and  cultures.  That  move¬ 
ment  included  efforts  to  develop  picture 
languages  and  Esperanto.  Musical  notation  is 
a  specialized  interval  scale  that  makes  use  of  a 
limited  visual  alphabet  corresponding  to 
modes  of  execution  of  notes  as  well  as  a  spa¬ 
tial  scale  corresponding  to  pitch.  Finally,  for 
displaying  ratio  information,  pie  charts  can  be 
useful,  where  the  area  of  the  pie  corresponds  to 
the  proportion  on  the  relevant  variable. 
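
The isotype idea, one icon per unit of output with the number of icons proportional to the quantity, can be sketched in a few lines. The Python below is a minimal text-only illustration under stated assumptions: the function names, the letter icons, and the example quantities are all hypothetical and merely stand in for the pictorial icons (wheat, ingot, oil well) described above.

```python
def isotype_row(label, quantity, unit, icon):
    """Return one text row: a padded label followed by one icon per unit of output."""
    count = round(quantity / unit)  # nearest whole number of icons
    return f"{label:<8}" + icon * count


def isotype_chart(rows, unit):
    """Stack rows so that quantities can be compared by the length of each icon run."""
    return "\n".join(isotype_row(label, qty, unit, icon) for label, qty, icon in rows)


if __name__ == "__main__":
    # Hypothetical sector outputs; one icon stands for 10 units.
    example = [("grain", 40, "W"), ("steel", 20, "S"), ("oil", 30, "O")]
    print(isotype_chart(example, unit=10))
```

Because every icon stands for the same unit, the length of each row works like a bar in a bar chart, while the icon itself identifies the sector.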

DIRECTIONALITY 

In  spite  of  the  uncountable  number  of  pos¬ 
sibilities  for  indicating  order  in  graphic  dis¬ 
plays,  the  actual  choices  are  remarkably 
limited.  In  principle,  elements  could  be  ordered 
in any number of orientations in a display. Nevertheless,
graphic displays tend to order
elements  either  vertically  or  horizontally  or 
both.  Similarly,  languages  are  written  either 
horizontally  or  vertically,  in  rows  or  in  columns. 
There  are  reasons  grounded  in  perception  for 
the  preference  for  vertical  and  horizontal  ori¬ 
entations.  The  perceptual  world  has  a  vertical 
axis defined by gravity and by all the things on
earth correlated with gravity and a horizontal
axis defined by the horizon and by all the things
on earth parallel to it. Vision is especially acute
along the vertical and horizontal axes (Howard,
1982). Memory is poorer for the orientation of
oblique lines, and slightly oblique lines are perceived
and remembered as more vertical
or horizontal than they were (Howard, 1982;
Schiano and Tversky, 1992).

Of  all  the  possible  orientations,  then,  graph¬ 
ic  displays  ordinarily  only  use  the  vertical  and 
horizontal.  What’s  more,  they  use  these  orienta¬ 
tions  differently.  Vertical  arrays  take  precedence 
over  horizontal  ones.  Just  as  for  the  choice  of 
dimensions,  the  precedence  of  the  vertical  is  also 
rooted  in  perception  (Clark,  1973;  Cooper  and 
Ross, 1975; Lakoff and Johnson, 1980; Franklin
and Tversky, 1990). Gravity is correlated with
the vertical, and people are oriented vertically. The
vertical axis of the world has a natural asymmetry,
the ground and the sky, whereas
the horizontal axis of the world does not. The
dominance of the vertical over the horizontal is
reflected in the dominance of columns over rows.
It is more usual and more natural to make
a vertical list than a horizontal one. Similarly,
bar charts typically contain vertical columns.

There  is  another  plausible  reason  for  the 
dominance  of  the  vertical  over  the  horizontal. 
Not only does the vertical take precedence over
the  horizontal,  but  there  is  a  natural  direction 
of  correspondence  for  the  vertical,  though  not 
for  the  horizontal.  In  language,  concepts  like 
more  and  better  and  stronger  are  associated  with 
upward  direction,  and  concepts  like  less  and 
worse  and  weaker  with  downward  direction 
(Clark, 1973; Cooper and Ross, 1975; Lakoff
and Johnson, 1980). People and plants, indeed
most life forms, grow upwards as they mature,
becoming bigger, stronger, and
(arguably) better. Healthy and happy people
stand  tall;  sick  or  sad  ones  droop  or  lie  down. 
More  of  any  quantity  makes  a  higher  pile.  The 
associations  of  up  with  quantity,  mood,  health, 
power,  status,  and  more  derive  from  physical 
correspondences  in  the  world.  It  is  no  accident 
that  in  most  bar  charts  and  X-Y  plots,  increases 
go  from  down  to  up.  The  association  of  all  good 
things  with  up  is  widely  reflected  in  language 
as  well  (inflation  and  unemployment  are  ex¬ 
ceptions,  but  principled  ones,  as  the 
numbers used to convey inflation and unemployment
go up). We speak of someone “at the
top of the heap,” of doing the “highest good,”
of “feeling up,” of being “on top of things,” of
having “high status” or “high ideals,” of doing
a “top-notch job,” of reaching “peak performance,”
of going “above and beyond the line
of duty.” In gesture, we show success or approval
with thumbs up, or give someone a congratulatory
high five. The correspondence of
pitch with the vertical seems to rest on another
natural cognitive correspondence. We produce
higher notes at higher places in the throat, and
lower notes at lower places. It just so
happens that higher notes correspond to higher
frequency waves, but that may simply be a happy
coincidence.

In  contrast,  the  horizontal  axis  is 
standardly  used  for  neutral  dimensions,  for 
example,  time.  Similarly,  with  the  major  ex¬ 
ception  of  economics,  neutral  or  indepen¬ 
dent  variables  are  plotted  along  the  horizon¬ 
tal  axis,  and  the  variables  of  interest,  the  de¬ 
pendent  variables,  along  the  vertical 
axis.  Although  graphic  conventions  stipulate 
that  increases  plotted  horizontally  proceed 
from  left  to  right,  directionality  along 
the  horizontal  axis  does  not  seem  to  rest  in 
natural  correspondences.  The  world  is  asym¬ 
metric  along  the  vertical  axis,  but  not 
along  the  horizontal  axis.  Right-left  reflec¬ 
tions  of  pictures  are  hardly  noticed  but  top- 
bottom  reflections  are  (e.  g.,  Yin,  1969). 
Languages are just as likely to be written left
to right as right to left (and in some cases,
both),  but  they  always  begin  at  the  top. 
Children  and  adults  from  cultures  where  lan¬ 
guage  is  written  left  to  right  as  well  as  from 
cultures  where  language  is  written  right 
to  left  mapped  increases  on  a  variety  of 
quantitative  variables  from  down  to  up,  but 
almost  never  mapped  increases  from  up  to 
down.  However,  people  from  both  writing 
cultures  mapped  increases  in  quantity  and 
preference  from  both  left  to  right  and  right 
to  left  equally  often.  The  relative  frequency 
of using each direction to represent quantitative
variables did not depend on the direction
of written language (Tversky et al.,
1991). Despite the fact that most people are
right-handed and that terms like dexterity
derived from “right” in many languages have
positive connotations and terms like sinister
derived from “left” have negative connotations,
the horizontal axis in graphic displays
seems to be neutral. Consistent with
that,  we  refer  to  one  side  of  an  issue  as  “on 
the  one  hand,”  and  the  other  side  as  “on  the 
other  hand,”  which  has  prompted  some 
politicians  to  ask  for  one-handed  advisors. 
And  in  politics,  both  the  right  and  the  left 
claim  the  moral  high  ground. 

Children’s  and  adults’  mappings  of  tem¬ 
poral  concepts  showed  a  different  pattern 
from their mappings of quantitative and preference
concepts (Tversky et al., 1991). For
time, they not only preferred to use the horizontal
axis, they also used the direction of
writing  to  determine  the  direction  of  tempo¬ 
ral  increases,  so  that  people  who  wrote  from 
left  to  right  tended  to  map  temporal  concepts 
from  left  to  right  and  people  who  wrote  from 
right  to  left  tended  to  map  temporal  concepts 
from  right  to  left.  This  pattern  of  findings  fits 
with  the  claim  that  neutral  concepts  such  as 
time  tend  to  be  mapped  onto  the  horizontal 
axis.  The  fact  that  the  direction  of  mapping 
time  corresponded  to  the  direction  of  writ¬ 
ing  but  the  direction  of  mapping  quantitative 
variables  did  not  may  be  because  temporal 
sequences  seem  to  be  incorporated  into  writ¬ 
ing  more  than  quantitative  concepts,  for  ex¬ 
ample,  in  schedules,  calendars,  invitations, 
and  announcements  of  meetings. 

Consistent  with  the  previous  arguments 
and  evidence,  ordinal  charts  and  networks 
tend  to  be  vertically  organized.  A  survey  of 
the  standard  scientific  charts  in  all  the  text¬ 
books  in  biology,  geology,  and  linguistics  at 
the  Stanford  Undergraduate  Library  revealed 
vertical  organization  in  all  but  two  of  48 
charts  (Tversky,  1995).  Furthermore,  within 
each  type  of  chart,  there  was  agreement  as 
to what appeared at the top. In 17 out of the
18 evolutionary charts, Homo sapiens, that
is, the present age, was at the top. In 15 out
of the 16 geological charts, the present era
was at the top, and in 13 out of the 14
linguistic trees, the proto-language was at the
top. In these charts, in contrast to X-Y graphs,
time runs vertically, but time does not seem
to  account  for  the  direction,  partly  because 
time  is  not  ordered  consistently  across  the 
charts.  Rather,  at  the  top  of  each  chart  is  an 
ideal.  In  the  case  of  evolution,  it  is  human¬ 
kind,  regarded  by  some  as  the  pinnacle  of 
evolution,  a  view  some  biologists  discourage. 
In  the  case  of  geology,  the  top  is  the  richness 
and  accessibility  of  the  present  era.  In  the 
case  of  language  trees,  the  top  is  the  proto- 
language,  the  most  ancient  theoretical  case, 
the  origin  from  which  others  diverged.  In  or¬ 
ganizational  charts,  say  of  the  government  or 
large  corporations,  power  and  control  are  at 
the  top.  For  diagramming  sentences  or  the 
human  body,  the  whole  is  at  the  top,  and  parts 
and  sub-parts  occupy  lower  levels.  In  charts 
such  as  these,  the  vertical  relations  are  mean¬ 
ingful,  denoting  an  asymmetry  on  the  mapped 
relation,  but  the  horizontal  relations  are  of¬ 
ten  arbitrary. 

BASIS  FOR  METAPHORS  AND 

COGNITIVE  CORRESPONDENCES 

A  major  purpose  of  graphic  displays  is 
to  represent  visually  concepts  and  relations 
that  are  not  inherently  visual.  Graphic  dis¬ 
plays  use  representations  of  elements,  prima¬ 
rily  icons,  and  the  spatial  relations  among 
them  to  do  so.  To  enhance  communication, 
both  elements  and  relations  are  based 
on  people’s  perception  of  and  interaction 
with  the  familiar  physical  world,  especially 
the  spatial  world.  People  have  extensive  ex¬ 
perience  observing  and  interacting  with  the 
physical  world,  and  consequently  extensive 
knowledge  about  the  appearance  and  behav¬ 
ior  of  things  in  it.  It  is  natural  for  this 
concrete  experience  and  knowledge  to  serve 
as  a  basis  for  pictorial,  verbal,  and  gestural 
expression. 

Naturalness  is  found  in  natural  correspon¬ 
dences,  "figures  of  depiction,”  and  spatial  met¬ 
aphors,  derived  from  extensive 
human  experience  with  the  concrete  world.  It 
is  revealed  in  language  and  in  gesture  as  well 
as  in  a  long  history  of  depictions. 
ABSTRACT 

In  this  paper,  we  argue  that  accounts  of  anal¬ 
ogy  should  be  consistent  with  the  theoretical 
frameworks developed for related cognitive processes,
such as induction. On the one hand, this allows us to
anchor our theoretical perspectives on analogy more
firmly, and, on the other hand,
this  may  offer  ways  to  improve  on  the  current 
theories  in  the  related  fields.  We  propose  some 
steps  towards  these  goals. 

1.  INTRODUCTION 

The  study  of  analogy  confronts  us  with  a 
formidable  challenge.  Its  manifestations  are 
seemingly  ubiquitous  :  from  perceptual  process¬ 
es  responsible  for  recognizing  concepts  in  “raw 
data”,  to  categorization  relying  on  perceived 
similarity,  up  to  “higher”  cognitive  processes 
including  communication  through  metaphors  or 
creativity. It is definitely not to be ignored.
But  at  the  same  time  it  is  very  difficult  to  study. 

First of all, owing to its multifarious aspects,
it tends to be a slippery and hard-to-delimit
notion. Many works (Indurkhya, 89) have proposed
to distinguish several types of analogies,
emphasizing differences in purposes, a priori
information and underlying processes. Although this
brings some clarification, at the price of added
complication, it still remains to define precisely,
in each case, both the goal of analogy (and the
attached performance criteria) and the mechanisms
involved.

Second, analogical reasoning is an unjustifiable
(i.e. not logically valid) inference procedure.
It goes beyond the deductive closure of the initial
information and therefore cannot offer any warranty
on its conclusions. But then what supports
analogies? What makes one analogy better than
another? More concretely, why is it used so much,
apparently to the benefit of reasoning agents (as
sanctioned by Evolution)? Again, we encounter the
problem of the evaluation criteria. More basically,
the difficulty lies in the lack of a firm referential
system upon which to build and evaluate theories
and models of analogy.

Responses  to  these  problems  have  been 
twofold.  One  has  been  to  seek  some  normative 
characterization  of  analogical  reasoning  where¬ 
by  necessary  conditions  for  sound  inferencing 
are  stated  (Russell,  1987).  Unfortunately  this 
interesting  approach  so  far  has  delivered  very 
restrictive  conditions  that  in  effect  exclude 
much  of  the  subject  matter.  The  other  approach 
takes  natural  reasoning  agents,  prominently 
human  ones,  as  standards  against  which  to  mea¬ 
sure  the  quality  of  analogies  and  of  the  mecha¬ 
nisms  that  produce  them.  But  of  course,  these 
natural  yardsticks  are  subject  to  many  parame¬ 
ters  (perceived  context,  implicit  goals,  cultural 
background  and  so  forth)  that  are  impossible 
to  securely  control.  Therefore  this  opens  the 
door  for  endless  arguments  about  the  relevance 
and  validity  of  each  new  experiment,  and  con¬ 
sequently  of  the  tested  models. 

It  is  noteworthy  that  in  this  context,  what 
is  evaluated  are  not  so  much  the  end  results  of 
analogical  inferencing,  but  rather  the  process¬ 
es  that  are  assumed  to  play  a  key  role  in  their 
production. For instance, once it has been
hypothesized that similarity judgments are at the
core of analogical reasoning (and many other
cognitive processes as well), theories, models,
and arguments center on similarity measurements
and what they involve, in effect
evacuating the fundamental question of why
a high degree of similarity between a source
case and a target case should entail highly
reliable transfers of information from one to the
other (leaving aside both the important issue
regarding the objective nature of similarity
(Medin et al., 1993) and the question of the
modus operandi of these transfers).

This overall situation, namely a subject matter
concerned with an inferencing process that both
presents seemingly many different facets and
manifestations and inherently lacks sound
justification, is reminiscent of the situation faced by
students of induction ten to fifteen years ago.
There too, there were plenty of models for
inductive reasoning that were assessed on the basis
of their measured performance on chosen benchmarks,
and a corresponding need for an established
theory. The situation has changed recently
(mostly thanks to Vapnik (1995), Valiant and
many brilliant co-workers of the COLT
(Computational Learning Theory) community).

This  apparent  aside  on  induction  points  out 
a  third  potential  way  of  approaching  analogical 
reasoning.  Since  it  is  supposed,  rightly,  that  it  is 
a  core  component  of  many  cognitive  processes, 
it should not be an isolated point with regard to
its  internal  working  and  its  performance  crite¬ 
ria.  In  other  words,  properties  and  principles 
uncovered  in  studying  other  fundamental  cog¬ 
nitive  processes  should  hardly  be  expected  not 
to  be  shared,  at  least  in  part,  with  analogical  rea¬ 
soning.  Consequently,  any  theory  and  model  of 
analogy  should  be  consistent  with  theories  and 
models  for  other,  related,  faculties.  This  could, 
and  should,  provide  for  good  anchor  points  on 
which  to  erect  models  of  analogy. 

This  is  indeed  the  track  that  we  take  in  this 
paper.  In  a  way,  we  are  pursuing  a  very  ambi¬ 
tious  goal,  that  of  uncovering  some  fundamen¬ 
tal  traits  that  would  constitute  the  basis  for  an 
overall  theory  that  would  encompass  several 
cognitive faculties, including of course analogy
making. We propose neither to find justifications
for analogical inferencing, a hopeless
pursuit, nor to assess the value of one's model
by comparison with natural reasoning agents,
something necessary but not sufficient and never
completely satisfactory or convincing,
but rather to present a theory of analogical
reasoning that both satisfies a reasonable criterion
for analogy and, at the same time, is consistent
with existing theories of inductive learning, a
process that we argue is intimately related to
analogical inferencing.

This  paper  presents  the  current  state  of  this 
endeavor.  Section  2  argues  that  analogical  rea¬ 
soning  and  induction  are  intimately  connected 
while  at  the  same  time  being  different  in  im¬ 
portant  aspects.  It  also  sums  up  the  current  state 
of  accepted  theories  of  induction.  In  section  3, 
we  present  our  own  model  of  analogy,  show¬ 
ing  in  which  respects  it  is  intuitively  appealing 
and how it maintains close links with theories
of  inductive  learning.  Section  4  demonstrates 
on  a  canonical  example  that  the  model  yields 
realistic  results.  Finally,  section  5  sums  up  the 
state  of  this  project  and  points  to  directions  for 
future  research. 

2. ANALOGY AND INDUCTION: RESEMBLANCES AND DISSIMILARITIES

Deeply  rooted  in  analogy  surely  rests  the 
notion  of  similarity.  At  the  least,  analogy  in¬ 
duces  similarity,  sometimes  totally  unexpect¬ 
edly,  as  in  creative  analogy.  The  objective  na¬ 
ture  of  similarity  is  the  object  of  active  debate 
within  psychological  circles  (Medin  et  al. 
1993),  but  it  undoubtedly  underlies  categori¬ 
zation  too  :  similar  things  tend  to  be  grouped 
together  in  cognition.  Analogical  reasoning  also 
shares many points with induction, as
we shall now see.

2.1 A view on inductive learning and its theory

Figure 1 provides a flavor of what we are
up to in inductive learning. A collection of
examples, the learning set, is given, consisting of
pairs (x_i, f(x_i)), and the goal is to infer what
value the hypothetical and unknown function f
would take on new points x_j. Generally, there is
a cost associated with errors on f(x_j), also called
the risk, so that inductive learning consists in
finding a hypothesis h such that the risk
averaged¹ over the space of all possible instances,
or the expected risk, is minimal.

Figure 1. Inductive learning (in the supervised setting)
consists in identifying a function f that “explains” the
learning data (a set of pairs (x_i, f(x_i))) and making the
inference that the same f applies to unseen instances.
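
The two notions of risk used in this passage can be written compactly. The notation below is an assumption of this sketch rather than the paper's own: $c$ denotes the cost of predicting $h(x)$ when the true value is $f(x)$, $P$ the distribution over the instance space, and $m$ the number of examples in the learning set.

```latex
R(h) = \int c\bigl(h(x), f(x)\bigr)\, dP(x)
\qquad\qquad
R_{\mathrm{emp}}(h) = \frac{1}{m} \sum_{i=1}^{m} c\bigl(h(x_i), f(x_i)\bigr)
```

Inductive learning aims at a small expected risk $R(h)$, whereas only the empirical risk $R_{\mathrm{emp}}(h)$ can actually be measured on the learning set.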

Before  the  large  diffusion  of  theoretical 
studies of induction (Vapnik, 1995), the common
view was that the obvious learning strategy
was to select a hypothesis minimizing the
risk  over  the  learning  set,  called  the  empirical 
risk  since  it  is  measurable,  in  order  to  automat¬ 
ically  get  the  optimal  hypothesis  with  respect 
to  the  expected  risk  (one  that  by  nature  is  un¬ 
known).  This  belief  has  been  formalized  and 
given  a  name  :  the  Empirical  Risk  Minimiza¬ 
tion  principle  (ERM  for  short).  In  essence,  what 
this  principle  states  is  that  the  best  account  for 
the  learning  instances  is  ipso  facto  the  best  one 
also  for  yet  to  be  observed  events.  Vapnik,  and 
many  other  theorists  in  the  last  fifteen  years, 
have  disproved  this  naive  view. 

Of course, the philosophers knew this all
along. There cannot be any miraculous basis
for inducing general laws from specific
observations. But theorists of inductive learning
have gone further, specifying sufficient
conditions for induction to be a reliable
source of inferences. Sketched in broad lines,
the now “classical” theory of induction states
that induction is possible and reliable to the
extent that the set of potential candidate
hypotheses considered by the learner is
restricted². In other words, a learner that is able
to explain any data set is thereby unable to
perform induction, while a learner that can only
consider severely restricted classes of
concepts, if with these it can explain the
observed data (available in sufficient quantity),
is justified in generalizing to other, as yet
unknown, cases. Given that there is no “free
lunch”, the problem is now to choose a priori
the right set of hypotheses.

¹ More precisely, the averaging is weighted by the
distribution over the instance space, so that more weight is
given to dense areas, where it is more likely to encounter
future events.

Figure 2. The best model for the data points is deemed
to be the one that is at the same time “simple” and fits
the data well. Here, the linear model is simpler to
specify than the polynomial one, and seems to fit the
data points equally well (or badly?). Hence, following the
MDLp, it should be preferred.
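
A toy illustration of why the hypothesis set must be restricted, under assumptions that are ours and not the paper's: the instance space is the integers 0 to 99 with a uniform distribution, the unknown concept is `x >= 50`, and six labelled examples are given. An unrestricted learner (a lookup table that answers 0 on anything unseen) and a restricted learner (a single threshold) both reach zero empirical risk, but only the restricted one generalizes well. All names are illustrative.

```python
INSTANCES = range(100)


def target(x: int) -> int:
    """Unknown concept f, assumed for this toy example."""
    return int(x >= 50)


LEARNING_SET = [(x, target(x)) for x in (3, 17, 42, 55, 68, 91)]


def risk(hypothesis, examples) -> float:
    """Fraction of examples on which the hypothesis disagrees with the label."""
    examples = list(examples)
    return sum(hypothesis(x) != y for x, y in examples) / len(examples)


def memorizer(examples):
    """Unrestricted learner: can 'explain' any data set by storing it."""
    table = dict(examples)
    return lambda x: table.get(x, 0)


def threshold_learner(examples):
    """Restricted learner: only hypotheses of the form x >= t are allowed."""
    def make(t):
        return lambda x: int(x >= t)
    # Empirical risk minimization within the restricted class
    # (ties are broken in favor of the smallest threshold).
    best_t = min(range(101), key=lambda t: risk(make(t), examples))
    return make(best_t)


everything = [(x, target(x)) for x in INSTANCES]
rote = memorizer(LEARNING_SET)
thresh = threshold_learner(LEARNING_SET)

# (empirical risk, expected risk) for each learner
results = {
    "memorizer": (risk(rote, LEARNING_SET), risk(rote, everything)),
    "threshold": (risk(thresh, LEARNING_SET), risk(thresh, everything)),
}
print(results)
```

Both learners fit the six examples perfectly, yet the lookup table errs on 47 of the 100 instances while the threshold errs on only 7, which is the point the classical theory makes about restricted hypothesis sets.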

It  is  noteworthy  that,  according  to  these 
theories,  the  confidence  that  one  may  put  in 
inductive  learning  only  depends  on  statistical 
quantities  characterizing  the  hypothesis  set  tak¬ 
en  as  a  whole,  as  well  as  the  distribution  and 
the  number  of  learning  instances. 

Other theoretical approaches to inductive
learning share this property. These are the
Bayesian perspective on learning and the related
Minimum Description Length principle
(MDLp). Roughly, they prescribe selecting the
hypothesis which is maximally probable given
the observed data and its a priori probability
(something that is easily computed with
Bayes' formula). The MDL view replaces this
principle by one where one should choose the
hypothesis such that the sum of its code length
(within some well-chosen coding scheme) and
the length of the description of the data
encoded with the hypothesis is minimal (figure 2
illustrates this). It is a remarkable fact that
the Vapnik theory and the MDL principle,
starting from widely different premises, can be
proved to be tightly linked, a fact that
reinforces the confidence in these theories.

² Technically, these restrictions concern the possible
partitions of the instance space that are induced by the
hypothesis set. They are measured via statistical quantities,
the most famous one being the Vapnik-Chervonenkis
dimension.
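
The link between the Bayesian prescription and the MDL one can be made explicit under the usual assumption that a code length is the negative base-2 logarithm of a probability. The symbols $D$ (the observed data) and $\mathcal{H}$ (the hypothesis set) are choices of this sketch, not the paper's notation.

```latex
\begin{aligned}
h_{\mathrm{MAP}}
  &= \arg\max_{h \in \mathcal{H}} P(h \mid D)
   = \arg\max_{h \in \mathcal{H}} P(D \mid h)\, P(h) \\
  &= \arg\min_{h \in \mathcal{H}} \bigl[ -\log_2 P(h) - \log_2 P(D \mid h) \bigr]
   = \arg\min_{h \in \mathcal{H}} \bigl[ L(h) + L(D \mid h) \bigr]
\end{aligned}
```

The second equality drops $P(D)$, which does not depend on $h$; the last one reads each negative logarithm as a description length, so the most probable hypothesis is also the one giving the shortest two-part description.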

This is all well and good, but does it have
anything to do with analogy?

2.2 The same, yet different

As already noted, there are several types
of analogies. Some involve the comparison of
two given items (e.g. “abc” and “122333”), and
some the completion of one item given the
other (e.g. if “abc => abd”, what should be the
completion of “aababc => ?”). This last case
(due to Hofstadter and his co-workers
(Mitchell, 1993), (Hofstadter, 1995)) is a tricky one.
We do not mean here that it might be difficult
for the reader to infer the completion
“aababcd”, but that this is a good example where
one is made aware of the fact that much more
has to be inferred. Indeed, nothing is given
about the way the strings (are they really?)
should be perceived, nor about the dependence
relationship between “abc” and “abd” in the
source case. Worse yet, the perception and
interpretation of the source depend on the
target probe. Had the latter been “American
Broadcasting Corporation => ?”, the
source “abc => abd” would have been thought
of completely differently. It is therefore
evident that this type of analogy encompasses the
former one, where no completion, other than
completion of interpretation, takes place. This
is why we will consider only this one type here.

Figure 3. One view of analogy making emphasizes its
inherent inferential aspect from limited information.
Only x, f(x), and x' are known to the agent. From these
“raw data”, the agent must infer their interpretation,
the dependence relation f in the source, and the
corresponding “transported” dependence relation in
the target. From this follows y'.

If  now,  we  take  a  look  at  figure  3,  it  may 
strike  us  that  analogy  is  but  a  limit  case  of  in¬ 
duction  where  one  has  access  only  to  one  learn¬ 
ing  instance.  Under  this  perspective,  analogy 
and  induction  are  the  same.  And  this  is  why  we 
argue that surely their respective theories should
be  consistent  so  that  they  merge  in  between 
where  very  few  learning  instances  are  available. 

On  the  other  hand,  there  exist  significant 
differences  that  make  problematic  the  simple 
extension  of  the  classical  theories  of  induction 
to analogy, but also, as we will see, offer the
perspective  of  refining  these  existing  theories 
beyond  their  current  state.  Here  is  a  list  of  these 
differences. 

•  The  prediction  is  to  be  performed  on  one 
point  only,  not  on  the  whole  potential  in¬ 
stance  space.  The  notion  of  expected  risk 
is  therefore  undermined  to  say  the  least. 

•  Each item potentially has its own referential
frame (as in “abc => abd”; “122333 => ?”,
or better yet in “abc => abd”; “American
Broadcasting Corporation => ?”). This
is in contrast to the unstated assumption in
induction that the sought hypothesis f
is the same all over the instance space.

•  The target plays an important role in
analogy, shaping the interpretation of the
source, while it does not intervene in any
way in existing theories of induction.

•  Finally, maybe as a consequence of the
above points, it is strongly believed that the
“distance” between the source and the
target plays a key role in analogy. In contrast,
there is no notion of distance between
instances in inductive learning³.
To sum up at this point: we believe that the
study  of  analogy  should  deliberately  take  into 
account  related  cognitive  processes,  such  as  cat¬ 
egorization  and  induction,  and  try  to  make  con¬ 
tact  with  the  theories  therein.  This  would  more 
firmly  anchor  tentative  theories  and  models  for 
analogy.  At  the  same  time,  developing  theories 
adapted to the specific demands of analogy offers
the prospect of refining the theories of the
related cognitive processes. To be more specific,
incorporating  the  notion  of  distance  between 
instances,  and/or  of  local  referential  frames,  into 
the theory of inductive learning, in needed harmony
with theories of analogy, should result in
finer  theories  of  induction.  Theories  that,  for  in¬ 
stance,  would  better  predict  which  amount  of 
information  is  needed  in  order  to  be  able  to  learn, 
say,  some  classes  of  concepts. 

It is in accordance with this philosophical
outlook that we have undertaken to look for
a  theory  of  analogy,  one  that  would  be  faithful 
to  the  phenomena,  and  be  related  to  theories  of 
inductive  learning. 

3.  A  PROPOSAL 

Let us not be misled here: we are not, at this
point,  looking  for  the  specification  of  a  reason¬ 
ing  mechanism  that  would  be  a  candidate  for 
modeling  analogy  making,  but  we  aspire  to  find 
a  criterion  for  evaluating  candidate  analogies, 
a  criterion  that  the  best  analogy  should  opti¬ 
mize.  Recalling  figure  3,  it  is  clear  that  this  cri¬ 
terion  must  depend  on  what  is  known  to  the 
reasoning agent, i.e. the source: x and f(x) (in
the best case including f itself), and the incomplete
target: x'. It should also depend on
prior  knowledge  which  is  the  basis  for  the  in¬ 
terpretation  of  the  situations. 

In addition to this, and following our policy,
we should find a criterion that is consistent
with the theory of induction. In particular, this
criterion should take into account the “entropy”
of the candidate hypothesis space, or, more
intuitively, the complexity of the candidate
hypotheses, the idea being that the more
underlying regularities are discovered in the data,
the more its expression can be compressed. The
MDLp is one expression of this general
doctrine. We should therefore look for a measure
of parsimony. The best analogy should
correspond to the discovery of regularities both in
the source and the target, regularities that should
be as interrelated as possible. This last point
is in agreement with a third desideratum: that
the evaluation criterion reflect in some way
our anticipation that analogy is linked to a
notion of perceived similarity or distance between
the analogs.

³ Beware not to confuse the notion of distance between
instances, as in analogy, with that between hypotheses or
between an instance and a hypothesis, as can be the case in
induction.

An  evaluation  criterion  for  analogy 

In figure 4, we show how a version of the
MDLp could be adapted to analogy. The best
analogy should be the one that minimizes the
cost of the models or interpretations on which
the perception of (x, f(x)) on the one hand,
and of x' on the other hand, are based, while at the
same time minimizing the cost of translating
the interpretation of the source into the
interpretation of the target. This is what is expressed in
the following proposition.

Given $M_S$, $M_T$ and $f$, it is easy to derive $f_T$
by $f_T = \mathrm{pgm}_{M_S \to M_T}(f)$, that is, the
transformation of the expression of $f$ within the
referential associated with $M_S$ by the program that
transforms referential $M_S$ into referential $M_T$.
Then $f_T(x')$ may be computed.


Figure  4. 
[Diagram with two panels labelled Source and Target]

Figure 5. Following the theory presented here, any analogy involves interpretations or models, constructed from prior
knowledge (the domain theory), that are local to the source ($M_S$) and to the target ($M_T$). From these, the specifics of
each case can easily be reconstructed. At the same time, analogy making implies that a relationship be identified
between $M_S$ and $M_T$ such that the two seem similar to each other. We submit that the best analogy is the one that
minimizes the overall cost of specifying the models, their relationship (from $M_S$ to $M_T$) and the derivation of the
specifics of each case.


Proposition:

The set of models and descriptions $M_S$, $M_T$, $x$, $f$, $x'$ that
minimizes the formula⁴:

$$\mathrm{TotalLength} = L(M_S) + L(x \mid M_S) + L(f \mid M_S) + L(M_T \mid M_S) + L(x' \mid M_T)$$

is the one associated with the best analogy between
the source and the target.
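
A minimal sketch of how such a criterion ranks candidate analogies, assuming each candidate has already been assigned its five description lengths. The two candidates below use the S1 and S2 columns reported in Table 2 of this paper; the names `AnalogyCost` and `total_length` are illustrative and not part of the original model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalogyCost:
    """Five description lengths (in bits) attached to one candidate analogy."""
    source_model: float               # L(M_S)
    source_given_model: float         # L(x | M_S)
    rule_given_model: float           # L(f | M_S)
    target_model_given_source: float  # L(M_T | M_S)
    target_given_model: float         # L(x' | M_T)


def total_length(cost: AnalogyCost) -> float:
    """Sum of the five terms of the proposition; smaller is better."""
    return (
        cost.source_model
        + cost.source_given_model
        + cost.rule_given_model
        + cost.target_model_given_source
        + cost.target_given_model
    )


# Values copied from the S1 and S2 columns of Table 2.
candidates = {
    "S1": AnalogyCost(10, 8, 4, 5, 8),
    "S2": AnalogyCost(9, 18, 4, 0, 36),
}

best = min(candidates, key=lambda name: total_length(candidates[name]))
print(best, total_length(candidates[best]))
```

In this reading, S1 pays 5 bits to translate the source model into the target model but describes the target very cheaply, whereas S2 reuses the source model for free and pays heavily to describe the target.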

4.  ILLUSTRATION 

This  section  intends  to  illustrate  the  above 
conceptualization.  It  is  not  meant  to  demonstrate 
its  value  as  a  model  of  the  human  ability  in  mak¬ 
ing  analogies.  This  is  beyond  the  scope  of  this 
short  paper,  and  would  require  a  careful  discus¬ 
sion  of  representation  primitives,  suitable  cod¬ 
ing  system,  and  hypothesized  prior  knowledge. 


⁴ L is taken to be a function measuring the cost or length
of its argument expressed in bits. We do not delve here into
technical details about what that involves. We refer the reader
to (Li & Vitanyi, 1993) for a thorough introduction to
algorithmic complexity theory, on which our model is based.


4.1  The  domain 

In  order  to  keep  things  manageable,  we 
have  chosen  a  domain  where  it  is  easy  to  de¬ 
fine  representation  primitives  and  theories,  and 
yet which presents enough richness to be
demonstrative of the wealth of issues in
analogy-making. This domain is inspired by the
microworld developed by Hofstadter et al. for the
COPYCAT project (Mitchell, 1993).

The  basic  objects  in  this  world  are  the  26 
letters  of  the  alphabet,  but  it  would  be  straight¬ 
forward  to  add  numbers  or  geometrical  shapes. 
The  task  consists  in  finding  how  a  letter  string  is 
transformed  given,  as  an  example,  another  string 
and its transform. For instance, given that abc
=> abd (the source), what becomes of iijjkk =>
? (the target). The problem, quite familiar in
IQ-like tests, is thus to identify the relevant aspects
and  transformation  at  work  in  the  source  that  can 
best  be  mapped  to  the  target  problem.  It  is  very 
easy  to  make  up  a  whole  variety  of  problems 
that  test  the  range  of  analogy-making. 
Following (Mitchell, 1993), the background
knowledge or domain theory includes
the basic representation primitives and the
conceptual structures that make it possible to describe
and highlight various aspects of the situations at
hand (see table 1). In order for the quality
criterion to be computable, each construct is
associated with a number that corresponds either
to a prior probability, from which it is easy to
draw the related length using the relation
L = -log2(P) (e.g. the concept of string is associated
with the prior 1/8, hence is of length 3 bits), or
directly to a length in bits (e.g. the concept of
nth requires n bits). These numbers can be
modified either manually or through learning
to yield various biases corresponding to a
variety of contexts or prior knowledge.

•  Features describing the conceptual structures:
   - orientation (->/<-): 1 bit
   - cardinality or number of elements n: log2(n) + 1 bits
   - length l: log2(l) + 1 bits
   - starting or ending with element = x: [value illegible in source] bits
•  Letter: (1/2)
   Particular letter (e.g. 'd'): (1/2 · 1/26)
•  String (orientation, elements): (1/8)
   L = 3 + L(orientation) + Σ L(elements)
   e.g. L('abd' with orientation = ->) = 3 + 1 - log2((1/2 · 1/26)^3) + L(3)
        = 3 + 1 + 18 + 3 = 25 bits
•  Sequence (orientation, type of elements, succession law, length,
   starting or ending with): (1/8)
   L = 3 + L(orient.) + L(type) + L(law) + L(length) + L(start/end)
•  Description and length of a succession-law:
   succ(type-of-el., n, x) = the nth successor of the elt. x of type-of-el.
   L = L(type) + L(n (see below)) + L(x)
   L(n) = L(1/6) if n=1 or -1 (first successor or predecessor)
          L(1/3) if n=0 (same element)
          [expression illegible in source] otherwise (with p=n if n>0, p=-n otherwise)
•  First / last: [value illegible in source]
•  nth: n bits

Table 1. List of some representation primitives with their associated description length either in bits or defined as
probabilities.

Hence, the string abc could be described as:

   'abc'   String                                 (1/8)
           orientation: ->                        (1/2)
           1st='A', 2nd='B', 3rd='C'              (1/2 · 1/26)^3
           TOTAL Length: 21 bits

or else as:

   'abc'   Sequence                               (1/8)
           orientation: ->                        (1/2)
           type of elements = letters             (1/2)
           succession-law:
             succ(elt(letter=x)) = elt(succ(letter,1,x))
             L(letter) + L(1st succ) + L(x)
             = L(1/2 · 1/6 · 1) = 4 bits
           length = 3                             3 bits
           starting with element(letter='A')      (1/26)
           TOTAL Length: 17 bits

It is clear that the last description, which
more fully represents the structure of the string
abc, is the most economical one, even though
it describes it more completely than the first
description, which corresponds to the
perception of a set of three letters.
4.2 Experiments 

We have tested the above scheme on a variety
of analogy problems in order to see what
rankings the criterion would give to various
possible solutions. Limited space prevents us from
giving  a  full  account  of  the  derivation  of  the  com¬ 
plexity  figures.  The  overall  method  is  as  follows. 
For  each  pair  (Problem;  Solution),  we  hypothe¬ 
size  associated  models  or  perceptions.  For  in¬ 
stance,  iijjkk  can  be  perceived  as  a  string  of  let¬ 
ters,  or  alternatively  as  a  sequence  of  successive 
pairs  of  letters.  Then,  a  program  computes  the 
algorithmic  complexity  of  these  constructs  and 
of the transformation programs that make it possible to
derive one description from another. The
associated figures are reported in table 2.
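
The kind of computation involved can be sketched for the two perceptions of abc discussed in section 4.1, using the priors of table 1 and the relation L = -log2(P). This is only an illustration under our own assumptions, not the authors' program: the function names are hypothetical, and no rounding is applied, so the values differ slightly from the whole-bit figures quoted in the text (21 and 17 bits), where individual terms such as the succession law are rounded to whole bits.

```python
import math


def bits(prior: float) -> float:
    """Description length L = -log2(P) of a construct with prior P."""
    return -math.log2(prior)


def count_bits(n: int) -> float:
    """Length of a cardinality or length feature: log2(n) + 1 bits."""
    return math.log2(n) + 1


def string_length(text: str) -> float:
    """'String' percept: concept (1/8), orientation (1/2), then each letter."""
    particular_letter = bits(1 / (2 * 26))
    return bits(1 / 8) + bits(1 / 2) + len(text) * particular_letter


def successor_sequence_length(n_elements: int) -> float:
    """'Sequence' percept of letters, each the first successor of the previous."""
    concept = bits(1 / 8)
    orientation = bits(1 / 2)
    element_type = bits(1 / 2)         # type of elements = letters
    law = bits((1 / 2) * (1 / 6) * 1)  # L(letter) + L(1st succ) + L(x)
    length = count_bits(n_elements)
    start = bits(1 / 26)               # starting with a given letter
    return concept + orientation + element_type + law + length + start


as_string = string_length("abc")
as_sequence = successor_sequence_length(3)
print(f"string: {as_string:.2f} bits, sequence: {as_sequence:.2f} bits")
```

The unrounded values are about 21.1 bits for the string percept and about 15.9 bits for the sequence percept, so the ordering of the two descriptions is the same as in the text.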

Problem: abc => abd ; iijjkk => ?

Solutions:

S1: “Replace rightmost group of letters by
its successor” iijjkk => iijjll

S2: “Replace rightmost letter by its
successor” iijjkk => iijjkl

S3: “Replace rightmost letter by D” iijjkk
=> iijjkd

S4: “Replace third letter by its successor”
iijjkk => iikjkk

S5: “Replace Cs by Ds” iijjkk => iijjkk

S6: “Replace rightmost group of letters by
D” iijjkk => iijjd
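
The six candidate rules can be written as small string transformations, which makes it easy to check that each of them is consistent with the source abc => abd while giving a different answer on the target. This is a sketch of the rules only, under our own assumptions (a group is taken to be a run of identical letters; the function names are illustrative); it does not compute description lengths.

```python
from itertools import groupby


def successor(letter: str) -> str:
    """Next letter of the alphabet; 'z' has no successor in this sketch."""
    if letter == "z":
        raise ValueError("'z' has no successor")
    return chr(ord(letter) + 1)


def groups(text: str) -> list[str]:
    """Split a string into runs of identical letters: 'iijjkk' -> ['ii', 'jj', 'kk']."""
    return ["".join(run) for _, run in groupby(text)]


def s1(text: str) -> str:
    """Replace rightmost group of letters by its successor."""
    *head, last = groups(text)
    return "".join(head) + "".join(successor(c) for c in last)


def s2(text: str) -> str:
    """Replace rightmost letter by its successor."""
    return text[:-1] + successor(text[-1])


def s3(text: str) -> str:
    """Replace rightmost letter by d."""
    return text[:-1] + "d"


def s4(text: str) -> str:
    """Replace third letter by its successor."""
    return text[:2] + successor(text[2]) + text[3:]


def s5(text: str) -> str:
    """Replace every c by d."""
    return text.replace("c", "d")


def s6(text: str) -> str:
    """Replace rightmost group of letters by d."""
    *head, _ = groups(text)
    return "".join(head) + "d"


RULES = {"S1": s1, "S2": s2, "S3": s3, "S4": s4, "S5": s5, "S6": s6}
answers = {name: rule("iijjkk") for name, rule in RULES.items()}
print(answers)
```

Every rule maps abc to abd, so the source alone cannot discriminate between them; the evaluation criterion is what ranks the resulting answers.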

                 S1    S2    S3    S4    S5    S6
L(M_S)           10     9    11    11    12    11
L(x|M_S)          8    18    18    18    22    15
L(f|M_S)          4     4     3     7     8     3
L(M_T|M_S)        5     0     0     0     0    17
L(x'|M_T)         8    36    36    36    42    15
Length (bits)    35    67    68    72    85    62
Rank              1     3     4     4     6     2

Table 2. The figures corresponding to the evaluation
formula are reported for various solutions to the
problem considered. Solution 1 emerges as a clear
winner, which is also the choice of most human subjects
when asked to rank these solutions.

5. CONCLUSION AND PERSPECTIVES

These experiments and calculations cannot
and do not pretend to be conclusive. They
rely on many hunches and simplifications that
would need to be more carefully set. Indeed, it
is natural that such be the case, since this shows
by the same token that our model nicely
incorporates contextual effects and the possibility of
learning (concepts and associations), and the
consequences these may have on analogy making.
Still, these results show that the proposed
scheme does not seem entirely unreasonable
from the point of view of a comparison with
natural cognition. But we also believe that most
promising is the fact that this model is tightly
linked with induction theory. Nonetheless, it
remains unclear why a high degree of similarity,
or the possibility of a simple interpretation
of the analogs, lends credit to the analogical
inference. This is a question we are actively studying.

In addition, one of our current research projects
is  to  better  ground  our  calculations  on  the  the¬ 
ory  of  algorithmic  complexity,  to  maintain 
close  links  with  inductive  theory,  while  at  the 
same  time  experimenting  with  many  more 
examples  from  a  variety  of  domains.  We  also 
study  how  mechanisms  for  the  actual  produc¬ 
tion  of  analogies  (not  only  for  evaluation) 
could  be  derived  from  this  perspective. 
INTRODUCTION 

How  is  reasoning  by  analogy  justified? 
Why  can  we  map  non-identical  elements  in  the 
source  and  target  analogs?  Is  it  valid  to  transfer 
some elements in one analog to another?
Although the problem of justification is of critical
importance for research on analogy, only
a few studies have discussed it seriously
(Gentner, 1983; Indurkhya, 1992). The aim of
this paper is to develop a new framework
that provides an answer to the justification
problem.

In  the  real  world,  analogical  reasoning  is 
widely  used  and  has  strong  power  in  various 
kinds of human activities, such as problem-solving,
learning, discovery, psychotherapy, literature,
myth,  political  and  legal  argument  (Holyoak  & 
Thagard,  1995).  It  is  a  powerful  tool  for  pro¬ 
viding  a  solution,  creating  a  new  idea,  arguing 
against,  and  persuading  opponents,  making 
ideas  more  explicit  and  impressive. 

However, some distrust analogy and claim
not to use it, because analogy is known to be a
dangerous mode of reasoning. Analogical reasoning,
like induction, does not have logical validity.
Indeed, there are abundant examples
of the misuse of analogies in various kinds of human
activities, such as education, science, political
argument, and commercial advertising
(Gentner & Jeziorski, 1993; Holyoak &
Thagard, 1995; Indurkhya, 1992). More depressing
findings were obtained by Chi and her
colleagues (Chi et al., 1989). Their study found
that poor learners relied excessively on analogy.
These learners frequently looked back to previous
problems, read them extensively, and tried to
map them onto the current problem, which resulted
in poor performance on transfer tasks.

The observations above present two
opposing pictures. In some cases, analogy enriches
human cognition and gives new insights.
In other cases, analogy obscures our rationality
and degenerates into poor learners' desperate
heuristics.

The purpose of the paper is to develop a
new framework that explains how
reasoning by analogy is justified, and under what
conditions analogies are, at least, psychologically
valid.

In  the  next  section,  I  analyze  the  conditions 
for  justified  analogies.  According  to  the  analy¬ 
sis,  analogy  should  be  treated  as  a  kind  of  cate¬ 
gorization.  This  means  that  analogy  is  a  terna¬ 
ry  relation  between  the  base,  target,  and  their 
superordinate  category  (abstraction),  rather  than 
a  binary  relation  between  the  base  and  target. 
Second, I will show that this formulation greatly
reduces the computational complexity of retrieval
and mapping. Third, I will try to figure
out the characteristics of categories from the
findings obtained in an informal observation.
Finally, I will reexamine the relationships of
analogy to other kinds of cognitive activities,
based on the proposed framework.

JUSTIFICATION 
Identicality

Although  controversies  are  still  continuing 
about  many  aspects  of  analogy,  there  is  one  ba¬ 
sic  assumption  that  few  deny.  This  assumption 
is that analogy involves mapping from the base
to the target. A set of elements in the base is put
into correspondence with a set of elements in the target.
Another set of elements in the base is then transferred
to the target to create new inferences.

Here, one can ask why some elements in
the base can be mapped and transferred to the
target. What enables mapping and transfer between
the base and target? It is a difficult problem,
but Leibnitz gave a partial answer to it.
According to his principle of the indiscernibility
of identicals, if two things are identical,
any of their predicates can be transferred.
This principle suggests that mapping requires
identicality between the base and target.

However, as far as analogy is concerned,
this principle is too rigorous to be applied, because
a base is not identical to a target, by definition.
The base is represented in a qualitatively different
way from the target. They are in no way identical.

Categorization 

Thus,  it  is  necessary  to  find  a  cognitive 
mechanism  that  makes  two  different  things  iden¬ 
tical  in  some  respects.  Logically,  it  is  impossi¬ 
ble,  but  there  is  one  psychological  mechanism 
that  can  do  it  approximately.  It  is  categorization. 
If  two  things  belong  to  the  same  category,  they 
are  properly  said  to  be  identical  in  terms  of  the 
category.  Suppose,  for  example,  that  there  are 
two cats that differ in their size, color, etc.
Despite these differences, they are identical with
respect to their "catness." If two cats belong to
the  same  category,  attributes  and  predicates  im¬ 
portant  with  respect  to  the  category  can  be 
mapped  from  one  cat  to  the  other. 

The  same  argument  can  be  applied  to  the 
theory  of  analogy.  If  there  is  a  superordinate 
category  whose  members  are  the  base  and  tar¬ 
get,  they  are  properly  said  to  be  identical,  with 
respect  to  the  category.  Thus,  properties  and 
relations  shared  by  the  base  and  the  category 
can  be  transferred  to  the  target. 

The  discussion  so  far  leads  us  to  change 
the  basic  framework  of  analogy.  As  I  said  ear¬ 
lier,  analogy  has  been  considered  to  be  a  bina¬ 
ry  relation  between  the  base  and  the  target. 
However,  if  the  argument  above  is  correct,  it 
follows  that  we  should  consider  analogy  as  a 


ternary relation between the base, target, and
their category. However, the term "category" usually
refers to a preexisting taxonomic category,
such as animal, plant, dog, etc., so I introduce a
more neutral term, abstraction, here. Note that
the term abstraction here refers to an abstracted
mental entity, not to the act of abstracting.

Attempts have been made to incorporate
abstractions into the theory of analogy. In artificial
intelligence research, several models of
analogy have made explicit use of abstraction
(Greiner, 1988; Kedar-Cabelli, 1985; Russell,
1988). Glucksberg and Keysar (1990) and Lakoff
(1993) assume abstracted mental entities
in understanding metaphorical statements, although
there are controversies between them.
A number of studies on transfer of learning have
shown the importance of abstractions (see, for
example, Gick & Holyoak, 1983; Goswami &
Brown, 1989). Thus, the framework proposed
here is not a new one. Rather, my attempt should
be considered as a synthesizing one.

COMPUTATIONAL  CONSTRAINTS 

By introducing the notion of abstraction,
we obtain a couple of constraints that greatly
reduce the computational complexity of retrieval
and mapping.

Retrieval 

An important consequence of introducing
the abstraction is that the analog retrieval mechanism
can make use of hierarchy. Many
researchers accept that our long-term memory
is organized hierarchically, from the most concrete
items to the most abstract ones.

If the retrieval mechanism makes use of the
information about the hierarchy, the cost of retrieval
is obviously reduced. For example, if an
abstraction is judged to be irrelevant in the process
of categorization, an analogizer need not
consider any of its descendants. Theories ignoring
the hierarchical information have to test
every subcategory even after its ancestral abstraction
is rejected.

There is another benefit. The higher one ascends
in a hierarchical tree, the less information is
available. Consequently, one may sometimes
descend the tree to obtain further information.
In this case, the hierarchical structure constrains
further search. If you select an abstraction at
some level and want to get more information,
you need not search the entire space. Instead, it
is sufficient to search items that are descendants
of the abstraction previously selected.
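
The pruning argument above can be made concrete with a minimal
sketch. It assumes a toy memory hierarchy whose node names, and the
relevance test used to reject abstractions, are hypothetical choices
made only for illustration; they are not part of the framework itself.

```python
from dataclasses import dataclass, field


@dataclass
class Abstraction:
    """A node in a hypothetical memory hierarchy (names are illustrative)."""
    name: str
    children: list = field(default_factory=list)


def retrieve(node, is_relevant, tested):
    """Return relevant leaf analogs, skipping all descendants of a rejected node."""
    tested.append(node.name)          # record every node actually examined
    if not is_relevant(node):
        return []                     # rejected: the whole subtree is pruned
    if not node.children:
        return [node.name]            # a concrete analog at the bottom
    found = []
    for child in node.children:
        found.extend(retrieve(child, is_relevant, tested))
    return found


memory = Abstraction("system", [
    Abstraction("flowing system",
                [Abstraction("water flow"), Abstraction("crowd flow")]),
    Abstraction("static structure",
                [Abstraction("bridge"), Abstraction("tower")]),
])
relevant = {"system", "flowing system", "water flow", "crowd flow"}
tested = []
analogs = retrieve(memory, lambda n: n.name in relevant, tested)
```

In this sketch "static structure" is examined once and rejected, so
"bridge" and "tower" never appear in `tested`; a flat search over all
stored analogs would have had to test both.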

One of the problems to be considered here
is whether concrete base analogs are hierarchically
organized. Many agree with the hierarchical
organization of the common natural
kinds, but how about knowledge structures
used in analogy?

First, memory organization is also found in more
complex materials such as stories and episodes.
Although there are controversies, some researchers
have shown that the story grammar type
of knowledge structure constrains encoding and
retrieval of stories.

The second line of evidence comes from
Schank's MOPs and TOPs type of knowledge
organization. In light of Black, Bower, & Turner's
experiment, Schank elaborated his theory
of scripts to include more abstract knowledge
structures. According to him, there are knowledge
structures that hierarchically organize concrete
representations of specific events. They
are called MOPs, metaMOPs, and universal MOPs.
In addition, he assumed a different kind of structure
that organizes thematically similar events,
which he called TOPs (thematic organization
packets). He believed that these best explain
cross-contextual reminding.

The third line of evidence comes from
Fukuda's work. In his experiments, subjects' reminding
was greatly improved when they were
given cues at a moderately abstract level,
compared with when they were given very similar
stories as cues. The superiority of such cues strongly
supports the idea that abstractions exist
and that concrete episodes are organized around
them.

Mapping 

In the mapping process, abstractions make
two contributions, both of which reduce the
computational costs involved in mapping. The
first one is concerned with the selection of candidate
elements to be mapped. Suppose that a
base has n elements. The number of
candidate sets to be mapped amounts to 2^n - 1. This
obviously causes a combinatorial explosion as n
grows.

However, if an abstraction is involved in
mapping, one need not suffer from it. This is because
what is true for the abstraction must be
true for its subordinate target. It follows that every
element in the abstraction can be, and should
be, mapped. Although one still has to decide
which element in the base corresponds to which
element in the target, the computational gain in
the selection of a candidate set is very large.
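
As a small check of the count given above, the following sketch
enumerates every non-empty subset of a base and compares the total
with 2^n - 1. The four element names are hypothetical and serve only
as an illustration.

```python
from itertools import combinations


def candidate_sets(elements):
    """Enumerate every non-empty subset of the base elements."""
    return [subset
            for size in range(1, len(elements) + 1)
            for subset in combinations(elements, size)]


base = ["pump", "pipe", "water", "valve"]   # illustrative base, n = 4
n_candidates = len(candidate_sets(base))    # equals 2**4 - 1
```

With an abstraction, by contrast, there is a single candidate set,
namely all of the abstraction's elements.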

The second benefit is concerned with the
number of elements in the abstraction. The
number of elements in an abstraction is, by definition,
smaller than that of its subordinate,
concrete base analogs. It is impossible to make
a general estimate of how much smaller the
number of elements in the abstraction is, but the
reduction in the number of elements produces a huge
computational gain in many cases.

For example, if one maps n elements of the
base to the target, the resulting number of possible
mappings is the number of permutations of n
elements (n!), shown by the thick line in the graph.
As you see, it grows at least exponentially. Suppose
that an abstraction has half of the elements.
The number of candidate hypotheses is
depicted by the broken line (the dotted line
shows the number of hypotheses when the number
of base elements is reduced to a quarter
of n). Although the number of possible mapping
hypotheses still grows explosively even assuming
the abstraction, the computational gain is huge
compared with the cases without abstractions.
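
To give a feeling for the size of this gain, here is a brief sketch
that counts one-to-one mappings for a base of n elements and for
abstractions with a half and a quarter as many elements. The value
n = 8 is an arbitrary choice for illustration.

```python
from math import factorial


def mapping_hypotheses(n_elements):
    """Number of one-to-one mappings of n elements onto n target elements (n!)."""
    return factorial(n_elements)


n = 8                                   # illustrative size of a concrete base
full = mapping_hypotheses(n)            # concrete base analog
half = mapping_hypotheses(n // 2)       # abstraction with half the elements
quarter = mapping_hypotheses(n // 4)    # abstraction with a quarter of them
```

For n = 8 the three counts are 40320, 24, and 2, so even a modest
reduction in the number of elements removes most of the hypotheses.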

ABSTRACTIONS IN ANALOGICAL
REASONING ABOUT ELECTRICAL
CIRCUITS

Informal  observation 

This is the stage at which to make the present
framework more concrete. My favorite example is
people's natural reasoning about the electric circuit.
As Gentner & Gentner (1983) reported,
the type of base analog used affects subjects'
predictions about the behavior of the circuit.
When a water flow system was introduced,
subjects correctly inferred the change in the
electricity when a battery was added serially. On the
other hand, subjects' predictions improved in the
case of parallel resistance when a teeming
crowd analogy was taught.

In the experiment, they gave subjects either
analog explicitly, and asked them to use it when
answering  the  problems.  However,  people  can 
draw  analogies  spontaneously  even  without  such 
instruction.  From  my  observation,  most  univer¬ 
sity  students  used  liquid  flow  analogies  initial¬ 
ly,  although  they  were  not  exactly  the  same  as 
the  water  flow,  as  I  will  show  you  later. 

Such  naturally  drawn  analogies  tell  us  many 
things.  First  of  all,  although  most  subjects  used 
a  kind  of  liquid-flow  analogy,  it  is  very  dubious 
that  their  analogies  were  based  on  a  specific 
experience  about  a  water  flow  system.  It  is  hard 
to  imagine  that  they  had  seen  water  flowing  in 
the  closed  circuit  with  a  pump,  even  harder  to 
imagine  they  had  seen  a  parallel  circuit  with  two 
pumps  attached  serially!  If  they  did  not  have  any 
experience with it, how could they make analogies?
This shows that in naturally drawn analogies,
the likelihood that a very concrete, episode-type
base analog is used is very low.

Second, the mapping was very immediate,
so immediate that subjects seemed to have no trouble
with candidate mapping hypotheses. In
my observation, not a single case was found in which
they made a mistake in finding correspondences.
Essential parts in the base and target were immediately
mapped, while non-essential parts
seemed not to receive even the slightest consideration.
The protocols show no statements about such
things as a pump's having a lever or a switch, or a pump's
need for external forces, although these play
causal roles in the actual water flow system.

Third, we observed the on-line construction
of a base. When subjects were asked to
estimate heating values at a resistance, many subjects
spontaneously and naturally switched the
source analog from the liquid-flow to the particle-flow.
That is, they changed the flowing entity
from liquid to something solid, such as people,
small stones, or particles. The shift seems
to have been made because water was judged not to be
a relevant analog for the generation of heat.
These solid objects, instead, enable people to
naturally infer the generation of heat by the friction
of contacting parts.

Flowing  system  abstraction 

The picture drawn by the observation is
quite different from the one that current
models of analogy draw. Despite the unavailability
of concrete base analogs, people had little
difficulty in reasoning analogically about
the behavior of the electric circuit. This fact
suggests that an abstraction, a flowing system,
is responsible for subjects' analogical reasoning.
This abstraction is very simple in the sense
that it consists of only three components: a
flowing entity, a path, and a force. A typical relation
between them is that the force causes the
entity to flow through the path.

The simplicity of the abstraction partly
explains the immediate mapping. Since there
are only three components that are distinct, and
every component of the abstraction is applied
by definition, there is little possibility of
misunderstanding the mapping relations.

The  flowing  system  abstraction  is  a  higher- 
order  abstraction,  in  the  sense  that  every  compo¬ 
nent  is  variable.  Thus,  it  must  be  supplemented 
and  enriched  by  contextual  information  involved 
in  the  problem  situation,  when  it  is  actually  used. 
This enables abstractions to be flexible. Even
when people cannot access a concrete base
analog, they can naturally make useful inferences
by instantiating the abstractions under the
constraints posed by the problem situation.

Furthermore, these characteristics explain
the on-line construction of a new base analog.
As I reported earlier, subjects could easily shift
from the liquid-flow to the particle-flow analog
by changing the flowing entity when they
dealt with the generation of heat. The ease of
the shift cannot be explained without assuming
the flowing system abstraction. If people
had used an actual water flow system as a base
analog, the shift could not have been made so
easily. This is because people would have to retrieve
a new analog by examining all the candidate
analogs again, and they would have to replace every
component of the analog with a new one: a water
pump with a loudspeaker, a pipe with a road,
etc. However, if one assumes the abstraction,
the search for a new analog is constrained. Furthermore,
it is enough to change one of the components
of the abstraction, because the existence
of the pushing force and the path is guaranteed
by the abstraction.
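
The shift described above can be sketched as the re-binding of a
single variable in a three-slot structure. The class name, the slot
names, and the concrete fillers below are assumptions made for
illustration; the text specifies only that the abstraction has a
flowing entity, a path, and a force.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FlowingSystem:
    """Hypothetical encoding of the flowing system abstraction."""
    entity: str   # what flows
    path: str     # what it flows through
    force: str    # what makes it flow

    def describe(self):
        # The typical relation: the force causes the entity to flow through the path.
        return f"{self.force} causes {self.entity} to flow through {self.path}"


# Liquid-flow instantiation used for the first questions about the circuit.
liquid = FlowingSystem(entity="water", path="pipe", force="pump")

# Shift to a particle-flow analog: only the flowing entity is re-bound,
# while the path and the force are kept as they are.
particle = replace(liquid, entity="small stones")
```

Because `particle` is derived from `liquid` rather than retrieved
anew, the two analogs share their path and force by construction,
which is one way to picture why the shift is cheap.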

In  addition,  the  flowing  system  abstraction 
provides  the  global  coherence  when  changing 
the analogs. If one uses a completely new analog,
there is a possibility of inconsistency between
what has been inferred and what will
be inferred. On the other hand, inferences based
on old and new analogs are consistent if they
are descendants of the same abstraction. In the
case  of  the  electric  circuit  analogy,  inferences 
based  on  the  liquid  flow  analog  are  guaranteed 
to  be  consistent  with  those  based  on  the  parti¬ 
cle  flow  analog. 

Some researchers have emphasized the process
of adaptation in analogy. Since there are
few problems to which a base analog can be applied
directly, it is often necessary to adapt analogs
to the current problem situation. For flexible
adaptation, it would be better for source analogs
to be small and simple, like the flowing
system abstraction. It is difficult to modify and
adapt big, deep, and complex analogs that contain
a lot of information.

Contrasting  abstraction-based  view  with 
current  theories  of  analogy 

The  framework  proposed  here  contrasts 
sharply  with  that  of  the  dominant  theories  of 
analogy.  According  to  the  dominant  view,  epi¬ 
sodes  are  represented  almost  literally  in  the 
form  of  first-order  predicate  logic.  Since  no 
abstraction  or  summarization  is  assumed  to  take 
place  in  encoding  source  episodes,  each  source 
episode  forms  a  large,  deep,  complex  structure. 
In  addition,  each  analog  is  stored  in  a  relatively 
isolated  fashion.  Thus,  some  assume  only  sur¬ 
face  level  matches  (Forbus  et  al.,  1995),  while 
others  can  only  make  use  of  word-to-word  lev¬ 
el  relations  (Thagard  et  al.,  1990).  In  mapping, 
many  theories  share  the  assumption  that  initial 
mapping is carried out syntactically. Since this
type of mapping generates a large number of
mapping hypotheses, one or more constraints
are called for to reduce them (Falkenhainer et
al., 1989; Holyoak & Thagard, 1989). The
mapped structure is static and isolated in the
sense that it is prestored in the source analog
and has few relations to other analogs. Thus,
when shifting a source, an analogizer has to
repeat the entire process.

On the other hand, the abstraction-based
view  of  analogy  assumes  small,  simple  abstract¬ 
ed  mental  entities  as  source  analogs.  A  small 
number  of  variabilized  components  are  in¬ 
volved  in  abstractions.  Each  source  abstraction 
is  connected  to  form  a  hierarchy.  In  mapping, 
variable  bindings  or  unification  take  place,  in  a 
deductive  fashion.  Since  an  abstraction  involves 
a  small  number  of  distinct  elements,  the  num¬ 
ber  of  possible  mapping  hypotheses  is  small. 
The  resulting  structure  is  liable  to  modification 
under  the  constraints  posed  by  the  target  ana¬ 
log  and  task  goal.  In  this  sense,  analogy  by  ab¬ 
straction  is  dynamic  and  constructive. 

The  findings  obtained  from  the  informal 
observation  of  people's  spontaneous  analogi¬ 
cal  reasoning  are  not  compatible  with  the  dom¬ 
inant  view.  First,  there  seem  to  be  no  large,  com¬ 
plex  source  analogs  available.  Second,  people 
retrieved the source analog very rapidly. It
seemed that only a limited number of candidate
analogs were under consideration. This suggests
that subjects may make use of the hierarchical
chical  information  in  retrieval.  Third,  mapping 
was  rapidly  carried  out  without  mistakes.  This 
suggests  that  they  did  not  suffer  from  a  large 
number  of  mapping  hypotheses,  which  in  turn 
leads  us  to  the  idea  that  a  source  analog  actual¬ 
ly  used  did  not  have  a  large,  complex  structure. 
Finally,  subjects  shifted  from  one  source  to 
another  flexibly  and  naturally,  by  changing  a 
part of the source analog. It would have taken
a relatively long time if they had replaced the
original analog with a completely new one. This
indicates  that  they  did  not  use  an  analog  repre¬ 
senting  the  actual  water  flow  system. 

These  findings  are  best  explained  by  the 
abstraction-based  view  of  analogy,  which  as¬ 
sumes  small,  simple  variabilized  mental  enti¬ 
ties  connected  hierarchically. 


RELATIONS  TO  OTHER  KINDS  OF 
COGNITION 

A number of researchers have explored the
processes and structures of analogy for many
years. They have revealed what subprocesses
are involved in analogical reasoning, what affects
human analogy making, as well as how
and where analogies are used. These findings
have led to computational theories of analogy. Through
competition among them, the level of analysis has
greatly improved, which in turn has led to
greater sophistication of the theories.

However, the relationships of analogy to
other kinds of cognition have been neglected in
the course of this scientific endeavor. Analogy
plays a central role in human cognition, but it
would be strange if there were a cognitive engine
designed specifically for making analogies. It
might be that analogy is a special combination
of more basic cognitive components. If so, we
should explore the relationships of analogy to
other kinds of cognition.

The  proposed  framework  opens  the  door 
of  analogy  to  other  kinds  of  cognition.  In  this 
section,  I  briefly  review  the  relationship  of  anal¬ 
ogy  to  categorization  and  deduction. 

Categorization 

One important relation is to categorization.
Although abstractions accessed in the course
of analogical reasoning are different from common
categories, the underlying mechanisms are
the same. Both assume the hierarchical structure
and the inheritance of properties.

Certainly, dominant models of categorization
seem to be a little too simplistic, because
they do not have principled methods for
distinguishing structural and surface information.
Thus,  as  Ramscar  and  Pain  (1996)  pointed  out, 
the  model  of  categorization  should  be  modi¬ 
fied  and  enriched  by  the  findings  obtained  from 
analogy  research. 

Deduction 

A striking finding provided by the framework
is that analogy is similar to deduction. My
proposal is the following: given that a target is a
member of an abstraction, and that the abstraction
has a property X, then the target has the property
X. This form of reasoning is properly said
to be a categorical syllogism. This explains why
some analogies seem to be psychologically valid:
they are deductions.
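
The inference pattern just described can be written schematically as
follows. The symbols are assumptions introduced only for this sketch:
A(x) stands for "x is a member of the abstraction", X(x) for "x has
the property X", and t for the target.

```latex
\[
\frac{\forall x\,\bigl(A(x) \rightarrow X(x)\bigr) \qquad A(t)}{X(t)}
\]
```

The first premise says that the property holds for every member of
the abstraction, the second that the target is such a member; the
conclusion transfers the property to the target.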

However, I do not intend to reduce analogy
to deduction. My position is the opposite.
From my viewpoint, deduction is a kind of analogy.
Abstractions used in the processes of analogy
do not have the same status as premises in
deduction. People may induce a wrong abstraction
in some cases, while they may access a
wrong abstraction in other cases. Thus, there
exists uncertainty in analogical reasoning. On
the other hand, categories appearing in deduction
are fixed, and proved to be relevant. Thus,
no ambiguity is found in deduction. If you accept
the argument above, you will notice that
deduction is a special case of analogy, not vice
versa.
ABSTRACT

An object can be categorized in a rather
arbitrary number of ways. A shoe can be categorized
as a Nike, model X, size 44 (by the person
making an inventory of a sports shop), as a
Nike (by the customer), as a sports shoe (by
somebody who intends to go jogging), as a shoe
(by somebody who looks for shoes), as something
used to go from place to place, as a covering
that  comes  in  direct  contact  with  the  terrain, 
etc.  A  tire  can  be  categorized  as  a  snow  tire,  as 
a  tire,  as  something  used  to  go  from  place  to 
place,  as  a  covering  that  comes  in  direct  con¬ 
tact  with  the  terrain,  etc.  What  happens  when 
drawing  an  analogy  between  a  tire  and  a  shoe 
(example  from  Gentner,  1988)?  We  will  argue 
that,  in  many  cases,  analogy  can  be  viewed  as  a 
categorization  process  within  a  network  of  cat¬ 
egories.  It  might  be  a  straightforward  categori¬ 
zation,  in  which  case  the  first  category  selected 
(source)  is  relevant  for  drawing  the  analogy.  In 
this  case,  it  could  be  debated  whether  this  pro¬ 
cess  should  be  called  analogy  or  categorization. 
However,  many  cases  considered  as  analogy  in 
the  literature  can  be  plausibly  described  as  sit¬ 
uations  of  categorization.  It  might  also  require 
an  upward  search  in  a  network  of  categories. 
The difficulty of an analogy can then stem from
the difficulty of this search, owing either to the
number of steps required to categorize at the
appropriate abstract level or to the difficulty of
accessing that level.

In  this  paper  we  will  first  describe  the  se¬ 
mantic  network  within  which  we  assume  that 
the  analogy  is  drawn.  We  will  then  discuss  how 


some  analogies  can  be  seen  as  categorizations 
and  how  other  analogies  can  be  seen  as  involv¬ 
ing  an  abstraction  process. 

A SEMANTIC NETWORK OF
CATEGORIES, MEDIUM OF THE
ANALOGY

Before introducing the semantic network,
we present some results and conceptions about
categorization that we will rely on.

Some  results  and  conceptions  about 
categorization 

There is a tendency to categorize objects at
their basic level (Rosch, 1978). However, this
categorization is not systematic. It is influenced
by the context and in particular by the goal of
the categorizer (for instance an apple can be
seen as a fruit but also as a thing to take for a
picnic, Barsalou, 1991), and by her
or his expertise in the field (experts in birds
will not categorize them at the same level as
lay people, Tanaka & Taylor, 1991).

The inclusion relation is a key relation within
categories (Smith & Medin, 1981). Yet, a
network of categories cannot be seen as a tree
(a taxonomy). As Richard and Tijus (1998) suggested,
a tomato can be seen as a fruit or as a
vegetable depending on whether the context is
preparing a dinner or a biology course. The structure
which organizes categories is more probably
a kind of hierarchy in which a subordinate
can have several superordinates, that is, a lattice
(Poitrenaud, 1995).
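
A minimal sketch of such a structure follows: each category lists its
superordinates, so a node may have more than one parent, and an
upward search from two objects finds the categories they share. The
category labels and links are illustrative assumptions loosely based
on the tomato and shoe/tire examples; they are not a claim about the
actual content of anyone's memory.

```python
from collections import deque

# Hypothetical lattice: category -> list of direct superordinates.
SUPERORDINATES = {
    "tomato": ["fruit", "vegetable"],          # two parents: not a tree
    "sports shoe": ["shoe"],
    "shoe": ["locomotion aid", "terrain covering"],
    "snow tire": ["tire"],
    "tire": ["locomotion aid", "terrain covering"],
}


def upward_distances(start, graph):
    """Breadth-first search upward; maps each reachable category to its step count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def shared_categories(a, b, graph):
    """Categories reachable from both a and b, nearest (fewest total steps) first."""
    da, db = upward_distances(a, graph), upward_distances(b, graph)
    common = set(da) & set(db)
    return sorted(common, key=lambda c: (da[c] + db[c], c))


common = shared_categories("sports shoe", "snow tire", SUPERORDINATES)
```

In this toy lattice the sports shoe and the snow tire meet two steps
up on each side, which gives one simple way to count the "number of
steps" an analogy requires.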
Categories do not only apply to groups of
objects, and a word of the language is not needed
to designate a category. There exist natural
and artifactual categories (Rosch, 1978, e.g.,
birds and T.V.), categories of scenes and environments
(Tversky & Hemenway, 1983, e.g.,
fancy restaurant), categories of events (Morris
& Murphy, 1990, e.g., food shopping), and
goal-oriented categories (e.g., ‘things to take on a
camping trip’), which can be ad hoc categories,
that is, built for the needs of a task, but can also
become well established in long-term memory
(Barsalou, 1991). For instance, ‘thing that is
desired but can’t be obtained and hence is
denigrated’ (see below) or ‘situations in which an
action taken to remedy a problem actually defeats
the main purpose of the thing affected by
the problem’ (Mitchell, 1993) might be considered
as categories.

Categorization  is  often  viewed  as  a  classi¬ 
fication  tool  (Rosch,  1978).  However,  catego¬ 
rization  is  not  only  a  cognitive  economy  de¬ 
vice  allowing  people  not  to  deal  with  too  many 
different  objects  in  the  world.  Categorization 
is  used  to  infer  non-evident  properties  from 
evident properties. If we consider that an object
belongs to a category, we can predict more
about this object than what we actually observed
(Anderson,  1991).  When  we  see  a  shoe,  we  can 
infer  that  it  can  be  used  for  walking,  that  it  will 
be  damaged  if  we  use  it  without  taking  care  of 
it,  that  it  might  have  been  made  by  children  in 
South-East  Asia. 

The properties of an object belonging to a
category are not equally easily accessed. While
some properties are context independent, others
are activated only in specific contexts
(Barsalou, 1982). For instance, the property that a
basketball rolls is activated in any case, but that a
basketball floats is activated only in specific cases,
such as basketballs loaded on a boat. This context
dependency can be interpreted as an access to a
superordinate category depending on the context
(for instance, depending on the context, a
basketball might be seen as belonging to the
category of floating objects, Sander, 1997).

Categories can have a complex structure.
The views that categories are represented
by a list of features (feature models) or by
some instances (exemplar models) have been
challenged. Several authors consider that our
representation of concepts is structured in a
more complex way and might include relations
between features or with other concepts (Murphy
& Medin, 1985; Wisniewski, 1995). For
instance, in our representation of the concept
car, we probably have information about the
respective roles of the different parts of a car
(the wheels, the seats, etc.) and about how
these parts interact.

Description  of  the  semantic  network 

Once  we  face  a  new  situation  (a  target), 
we  claim  that  analogy  making  can  be  described 
as  a  search  and  property  attribution  mecha¬ 
nism,  which  operates  within  a  semantic  net¬ 
work  which  has  been  activated  by  this  target 
situation.  The  construction  of  this  semantic 
network  is  circumstantial  and  contextualized: 
it  is  done  within  the  context  of  a  task,  in  the 
same  way  as  the  construction  of  the  ad  hoc 
categories  (Barsalou,  1991).  The  semantic 
network  includes  semantic  and  functional 
knowledge  associated  not  only  with  the  ob¬ 
jects  present  in  the  situation,  but  also  with  the 
objects  and  categories  associated  to  the  situa¬ 
tion;  the  semantic  network,  medium  of  the 
analogy,  is  a  part  of  a  general  knowledge  net¬ 
work  seen  under  a  certain  point  of  view,  that 
of  the  task.  It  is  built  from  two  operations  of 
selection:  an  operation  depending  on  the  na¬ 
ture  of  the  objects  of  the  situation  and  an  op¬ 
eration  depending  on  the  task  and  on  its  con¬ 
straints.  As  the  selection  is  made  also  at  the 
level  of  the  goals,  this  implies  that  the  same 
device,  used  for  two  different  tasks,  might  not 
generate  the  same  semantic  network.  In  par¬ 
ticular,  this  leads  to  a  selection  among  the 
potential superordinates. A computer used as
a word processor will evoke the typewriter,
the domain of writing, and the domain of manipulating
objects (Sander & Richard, 1997).
The same computer used to deal with a database
evokes (at least in France) the widespread
device known as the Minitel, which
includes a keyboard and a screen and is used
to give access, via the telephone, to many services.
Although all this is done
through the keyboard, other domains
are evoked (the domain of the telephone and
the domain of communication; Richard & Tijus, 1998).

Within  the  formalism  that  we  use  (PRO¬ 
COPE,  see  Poitrenaud,  1995),  categories,  con¬ 
sidered  from  the  point  of  view  of  the  inclusion 
relation,  are  the  nodes  of  a  network;  the  links 
between  the  nodes  represent  the  relation  «is  a 
kind  of».  From  the  point  of  view  of  the  part- 
whole  relation,  different  properties  can  be  as¬ 
sociated  with  each  part  (e.g.,  a  wheel  of  a  car 
has some properties and a seat has other properties).
The properties associated with a category
are those which are activated when an object is
considered as belonging to the category
(Anderson, 1991). One specificity of
this  formalism  (Poitrenaud,  1995)  is  that  goals 
that  can  be  achieved  on  an  object  are  consid¬ 
ered  as  properties  of  the  object  (for  instance  a 
property  of  a  shoe  is  that  it  can  be  used  for 
walking,  a  property  of  a  word,  considered  as  an 
object  of  a  text  editor,  is  that  it  can  be  moved). 
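
The organisation just described can be pictured with a small sketch. It is only an illustration under stated assumptions, not the PROCOPE implementation: the class name `Category`, its fields, and the three example nodes are hypothetical. It shows categories as nodes linked by the «is a kind of» relation, with goals stored as properties that subordinate categories inherit.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Category:
    """A node of the network; `parent` is the «is a kind of» link."""
    name: str
    parent: Optional["Category"] = None
    properties: set = field(default_factory=set)  # descriptive properties
    goals: set = field(default_factory=set)       # goals, treated as properties

    def ancestors(self):
        """Yield this category, then each superordinate, most specific first."""
        node = self
        while node is not None:
            yield node
            node = node.parent

    def all_goals(self):
        """Goals attached to this category or inherited from a superordinate."""
        return set().union(*(c.goals for c in self.ancestors()))


# Hypothetical fragment of a network for the text-editing task.
manipulable = Category("manipulable object", goals={"move", "delete"})
written = Category("written object", parent=manipulable, goals={"write"})
word = Category("word", parent=written, properties={"made of characters"})
```

In this sketch, a word considered as an object of a text editor inherits the goal of being moved from the superordinate category, which is the sense in which «it can be moved» counts as one of its properties.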

Once such a semantic network has been
activated, two kinds of analogies may be distinguished:
those which can be seen as straightforward
cases of categorization and those involving
an abstraction process.

ANALOGIES THAT CAN BE SEEN AS
STRAIGHTFORWARD 
CATEGORIZATIONS 

Analogy  as  a  categorization  process 

Several  investigators  have  already  claimed 
that  there  is  a  continuum  between  analogy  and 
categorization  (e.g.  Hofstadter,  1995;  Turner, 
1988). Indeed, in both analogy and categorization,
a known situation (the category or the source)
is used to treat a new situation (the object to be
categorized or the target) as if it were familiar
(Holyoak & Thagard, 1995; Spalding & Murphy,
1996). In both cases, one of the main purposes is
inferential: the knowledge about the source or the
category is used to infer features of the target
(Holyoak & Thagard, 1995; Anderson, 1991).

Analogy is, in some cases, a straightforward
case  of  categorization  in  which  the  target  situ¬ 
ation  is  assimilated  to  a  reference  class,  which 
is  the  source.  Properties  common  to  the  source 
and  to  the  target  are  used  to  access  the  source, 
which  enables  properties  belonging  to  the  ref¬ 
erence  class  (the  source),  to  be  attributed  to  the 
new  situation.  The  basic  process  in  this  case  is 
the  search  for  a  relevant  source.  We  consider 
that  a  source  is  accessed  according  to  the  sa¬ 
lient  features  it  shares  with  the  target  (Vosnia- 
dou,  1989).  The  salient  features  are  those  which 
the  participant  accesses  in  the  new  situation, 
taking  into  account  her  or  his  knowledge  and 
the  context  (Kokinov  &  Yoveva,  1996),  and  that 
she  or  he  considers  as  relevant. 

In our study of learning text editing (Sander
& Richard, 1997), we have shown that, in a
first step, the typewriter is a source of analogy
for the participants. We found that all the procedures
imported from the typewriter were used
by all the participants in the experiment from
the beginning of a learning session, whereas
only 12% of the procedures which were not
direct adaptations of typewriter procedures
were used by them. The analogy can be described
this way: the text editor is categorized
as a typewriter, as this is the known domain
which shares the greatest number of salient features
with it. The general goal is the same (to type a
text) and objects are shared: a keyboard and a
surface on which what is typed appears (screen
or sheet of paper). Knowledge associated with the
actions that can be performed with a typewriter
is described schematically by the network
of Figure 1.


Figure  1.  Actions  that  can  be  performed  with  a 
typewriter. 
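
The access step described above, in which the source is the known domain sharing the greatest number of salient features with the target, can be sketched as follows. This is a minimal illustration, not the authors' model: the function name `select_source`, the feature labels, and the two competing domains are hypothetical; only the typewriter features (same goal, keyboard, surface on which the typed text appears) come from the text.

```python
def select_source(target_features, known_domains):
    """Return the known domain sharing the most salient features with the target."""
    return max(known_domains,
               key=lambda name: len(known_domains[name] & target_features))


# Salient features the participant accesses in the new situation.
text_editor = {"goal: type a text", "keyboard", "surface showing typed text"}

# Candidate sources; the last two are hypothetical competitors.
known_domains = {
    "typewriter": {"goal: type a text", "keyboard", "surface showing typed text"},
    "telephone": {"keyboard", "goal: communicate"},
    "television": {"screen", "goal: watch a programme"},
}
```

With these assumed features, `select_source(text_editor, known_domains)` picks the typewriter, because no other domain shares as many salient features with the text editor.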
One advantage of seeing analogy as a property
attribution mechanism through categorization
is that it does not imply that the starting
point of the analogy requires a complex representation
of the target. The participant might
have a very crude representation of the new situation
before having selected a source, as is
probably the case in the situation of learning
how to use a text editor, as well as in many other
situations of analogy in which the new situation
is really unfamiliar. This issue of how a
first representation of a target situation is built
has been addressed in several recent works
(Bassok & Olseth, 1995; Hofstadter, 1995; Ross
& Bradshaw, 1994). In our view analogy is involved
in the first encoding of the new situation
because the category selected provides both
means of action (such as typing on a key) and
means of encoding the situation. For instance,
if the task is, with a text editor, to turn
‘ana logy’ into ‘analogy’ and if the participant
has no knowledge concerning how to delete a
space or how to move a string of characters, as
is the case if the typewriter is taken as a source
(Figure 1), the only encoding available for a
true novice is that ‘logy’ must be deleted and
written again after ‘ana’. If one knows how to
move a string of characters, she or he can code
the task as: the parts of the word have to be brought
closer together, which leads to the cut-and-paste
procedure. If the participant knows that the space can
be deleted, for instance knowing already that
the space is a kind of character, the situation
can be coded as deleting the space, using
an associated procedure like dragging then
clearing, or using the backspace key. The last
two codings imply (Sander & Richard, 1997)
that sources other than the typewriter have been
selected. In these cases, the way the situation is
coded depends on the source which has been
selected and thus cannot be seen as the entry
to the analogy mechanism.
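
The dependence of the encoding on the knowledge brought by the source can be laid out as a small table. The sketch below is only an illustration: the dictionary `ENCODINGS`, the knowledge labels, and the function `available_encodings` are hypothetical names, while the three codings of the ‘ana logy’ task are those described above.

```python
# Each coding of the task is paired with the knowledge it presupposes.
# The knowledge labels are hypothetical shorthand for the text's description.
ENCODINGS = {
    "delete 'logy' and retype it after 'ana'": set(),  # typewriter source suffices
    "bring the two parts closer (cut and paste)": {"move a string"},
    "delete the space (backspace, or drag then clear)": {"a space is a character"},
}


def available_encodings(knowledge):
    """Return the codings whose presupposed knowledge the participant has."""
    return [coding for coding, needed in ENCODINGS.items() if needed <= knowledge]
```

A true novice, for whom `knowledge` is empty, gets only the retyping coding; the other two become available only once a source other than the typewriter has supplied the corresponding knowledge.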

Application  to  classical  situations  of  analogy 
making 

Gentner’s example of the electric battery as
like a reservoir (Gentner, 1983) is relevant in this
context. It can be argued that a reservoir is defined
in the first place by its functional property, that
is, by the goal that it makes it possible to achieve. This
property is to permit storage for deferred use.
Thus, a reservoir is a member of the category
of objects that permit storage for deferred
use. Following Glucksberg & Keysar
(1990), it can be considered that the name of the
category is that of a typical member, in which
case an electric battery is an instance and reservoir
is the name of the category (Figure 2). The electric
battery is considered as a member of this
category and the property of permitting storage for
deferred use is attributed to it. Thus, it
can be said that an electric battery is a reservoir.
If we call reservoir1 a reservoir made of metal
and hollow, which makes it possible to store liquid, and
reservoir2 the category of the objects which permit
storage for deferred use, an electric
battery is a reservoir2.

Figure 2. Analogy between an electric battery and a reservoir.

Analogy between Aesop’s «sour grapes»
fable and Harry’s story. Consider Aesop’s
«sour grapes» fable as a source story, from
Wharton et al. (1994, p. 67): «A fox wanted
some grapes, but couldn’t reach them, so he
announced to his friends that the grapes were
sour anyway» and Harry’s story: «Harry hoped
to get a new position of marketing manager,
but was passed over, so he told his wife the job
would have been boring» (Ibid.).

We consider that if the source situation is
known and understood by the participant, it implies
that, while reading Aesop’s fable, he or
she will have built the category (or will have
added to it, if it already exists) of situations in
which a thing that is desired cannot be obtained
and hence is denigrated. It is not improbable
that such a category already exists, as it is usual
to notice that things that we can’t obtain are
denigrated. An expression, ‘sour grapes’, even
exists to designate such situations. Even if this
category does not exist yet for the participant,
she or he will construct it while understanding
the story. Aesop’s fable will become a typical
member of the category. When Harry’s story
is read, it will be categorized as another example
of the same category (Figure 3).

ANALOGIES  INVOLVING  AN 
ABSTRACTION  PROCESS 

Analogy making is not always a straightforward
categorization. The first way we categorize
situations can make the analogy difficult.
We might build, in a first step, representations
of two situations without any analogical connection
between them. We might also discover,
after having attributed to a target the properties
of a certain source, that this analogy is
limiting, as is the case when a text editor is
considered as a typewriter. It is crucial to provide
a mechanism explaining how this can be
overcome, that is, how the analogy can be discovered
(or why it cannot, if that is the case) or
how another source, other than the first one
selected, can be used if the first analogy
proved to be limiting.

The  view  that  we  propose  for  both  cases  is 
based  on  an  ascending  search  in  the  network 
and  can  be  summarized  as  accessing  an  abstract 
source  which  is  more  adequate. 
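
The ascending search can be sketched as a climb along the «is a kind of» links until a category offering the required goal is found. This is a minimal illustration under stated assumptions: the function `ascending_search` and the goals attached to each level are hypothetical; only the ordering of the three levels (typewriting, writing in general, manipulating objects) is taken from the text-editing study discussed in this paper.

```python
def ascending_search(start, parent_of, goals_of, goal):
    """Climb from `start` until a category offers `goal`.

    Returns (category, levels_climbed), or (None, levels_climbed) when the
    top of the network is reached without finding the goal.
    """
    node, levels = start, 0
    while node is not None:
        if goal in goals_of.get(node, set()):
            return node, levels
        node, levels = parent_of.get(node), levels + 1
    return None, levels


# Abstraction ordering of the sources (most specific first).
parent_of = {
    "typewriting": "writing in general",
    "writing in general": "manipulating objects",
}

# Hypothetical goals attached to each level, for illustration only.
goals_of = {
    "typewriting": {"type a character"},
    "writing in general": {"insert a word"},
    "manipulating objects": {"move an object"},
}
```

The number of levels climbed gives a rough index of how abstract the adequate source is: here a goal such as moving an object is only found two levels above typewriting.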

A  task  which  requires  reaching  an  abstract 
source  is  difficult  for  at  least  three  reasons.  First, 
the  learner  has  to  discover  a  more  general  cate¬ 
gory  to  which  the  object  may  be  assimilated  (for 
instance,  in  the  context  of  text  editing,  it  is  not 
obvious  to  see  a  digit  as  a  manipulatable  object). 
Second,  the  goals  to  be  considered  at  a  superor¬ 
dinate  level  are  less  specific  (for  instance,  to  de¬ 
stroy  an  object  is  a  general  goal  which  could  be 
specified  as  burning  it,  were  it  to  be  made  of 
wood,  killing  it,  were  it  an  animal,  erasing  it, 
were it a word). Third, the procedures as well
are less specific or are even lacking, so that the
goal may be conceived but not achieved (one
may have a goal without a procedure to achieve
it, such as not having a procedure to destroy a
piece of metal). For these reasons, we predict
that the more abstract or general the source domain
is relative to the target domain, the more
difficult the analogy becomes.

Figure 3. Analogy between Aesop’s fable and Harry’s story.

In  the  work  of  Sander  and  Richard  (1997), 
we  have  shown  that  progress  in  learning  was 
guided  by  analogies  with  sources  of  higher  level 
of  abstraction.  We  considered  two  categories 
more  abstract  than  typewriting,  ordered  by  an 
abstraction  relation,  namely  writing  in  general 
(typewriting  is  a  specific  way  of  writing  in  gen¬ 
eral,  as  handwriting  is  another  specific  way); 
and  manipulating  objects  (we  manipulate  the 
components  of  a  text  when  we  write  it,  when 
we  correct  it,  when  we  duplicate  and  move  parts 
of  the  text  from  one  place  to  another).  We  first 
identified  the  knowledge  concerning  each  of 
those  categories  by  placing  the  participants  in 
the  relevant  context  (for  instance  manipulating 
tokens  for  the  context  of  manipulating  objects) 
and asked them to solve tasks isomorphic to
the  ones  that  can  be  solved  on  a  text  editor  (a 
task  of  moving  a  string  of  contiguous  colored 
tokens  was  isomorphic  to  a  task  of  moving  a 
word  with  a  text-editor).  Doing  this  with  all  the 
objects  and  all  the  goals  involved  allowed  us  to 
identify  the  knowledge  about  the  hypothesized 
sources  (Figure  4.a  and  4.b)  and  to  compare  the 
learning  which  was  actually  observed  with  the 
successive  use  of  these  sources. 

Once knowledge about typewriting revealed
itself to be inadequate, tasks were first solved by
using knowledge about writing in general (i.e., using
the properties associated with the objects in
the network of Figure 4.a), or if the writing level
was inadequate, knowledge about manipulating
objects (i.e., using the properties associated with
the objects in the network of Figure 4.b). It is in
this order that participants progressively discovered
the properties of the text editor. Thus, the
analogy with typewriting is only the first step of
learning. This stage represents the participant’s
entry into the semantic network and, subsequently,
the entire learning process proved to be guided
by analogy with increasingly higher levels in
this network. In the work that we completed on
learning text editor functions, we were able to
identify very precisely which semantic network
was activated by the device and the tasks (knowledge
represented in Figures 4.a and 4.b was actually
tested). We will now show how data obtained
in the framework of different paradigms on analogy
can be analysed from the same perspective.

Figure 4.a. Network representing knowledge associated
with writing in general and relevant for text editing.

Figure 4.b. Network representing knowledge associated
with manipulating objects and relevant for text editing.

APPLICATION  TO  CLASSICAL 
SITUATIONS  OF  ANALOGY  MAKING 

Take,  for  instance,  a  classical  problem 
solved  by  analogy,  the  one  of  Archimedes,  who 
was  asked  by  his  king  to  determine  whether  a 
crown was pure gold. Because the per-volume
weight of gold was known, it would have been
easy to provide the answer if the volume of the
crown was known. However, the crown was too
ornate to measure its volume. Archimedes solved
the problem while bathing. He noticed that the
volume of water displaced by his body was equal
to the volume of his body, so the same should
hold true for the crown. In our view, the crucial
point in this analogy is that the crown, as a body,
is seen as having the very general property of all
concrete objects, that is, in water, they displace
a volume equal to their own. At the specific level,
there are very few similarities between a body
and a crown. Thus, the solution cannot be found
with that specific analogy, but at a more general
level, the one of concrete objects, the relevant
analogy can be drawn (Figure 5).

The crown has to be considered as a concrete
object, which implies neglecting its specific
properties such as symbol of kingship,
made of precious metal, etc. Likewise, the human
body is a living body and has to be considered
as a lifeless body to be put in the same
category as the crown.
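
The arithmetic that the property of displacement makes available can be written out explicitly. The sketch below is only a worked illustration: the function name `is_pure_gold`, the tolerance, and any masses and volumes passed to it are hypothetical; the density of gold, roughly 19.3 g/cm³, is the only physical constant assumed.

```python
GOLD_DENSITY = 19.3  # g/cm^3, approximate per-volume weight of pure gold


def is_pure_gold(mass_g, displaced_water_cm3, tolerance=0.05):
    """Compare the crown's density with that of gold.

    The volume of the crown is taken to be the volume of water it displaces,
    which is the general property of concrete objects used in the analogy.
    """
    density = mass_g / displaced_water_cm3
    return abs(density - GOLD_DENSITY) / GOLD_DENSITY <= tolerance
```

For instance, a hypothetical crown of 965 g that displaces 50 cm³ of water has the density of gold, whereas the same mass displacing 60 cm³ would indicate a lighter alloy.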

Figure 5. Archimedes’ analogy.

Figure 6. Analogy with the genie’s story.

Gick and Holyoak (1980) consider story
analogs using Duncker’s (1945) radiation problem
in which a tumor has to be destroyed by
rays without destroying healthy tissues. In what
is called the convergence solution, in which
several low-intensity rays are directed toward
the tumor from different directions, a basic difficulty
is to «think of rays as having the property
of divisibility» (p. 318). In our view, a good
candidate for a spontaneous analogy with rays
involved in the experiment would be a ray of
light: at this level, the relevant properties are
«the  intensity  can  vary»  and  «it  can  be  directed 
in  different  ways».  So  we  can  predict  that  a  large 
number  of  participants  will  produce  solutions 
using  these  properties.  Producing  a  convergence 
solution  requires  considering  the  rays  as  hav¬ 
ing  the  property  of  divisibility.  We  can  predict 
that  it  will  be  more  difficult  to  produce  a  con¬ 
vergence  solution  because  the  property  of  di¬ 
visibility  is  not  attributed  to  a  ray  of  light.  This 
prediction is supported by Duncker’s (1945)
results  concerning  participants  who  produced 
solutions  without  receiving  solutions  to  analog 
problems:  5%  spontaneously  produced  the  con¬ 
vergence  solution  versus  29%  who  produced 
the  open  passage  solution  (of  putting  a  tube  in 
the  esophagus),  and  40%  produced  a  kind  of 
operation  solution  (of  creating  a  tunnel  in 
healthy  tissues).  The  last  two  solutions  do  not 
require  considering  the  property  of  divisibility 
but  only  the  fact  that  rays  can  be  freely  orient¬ 
ed.  Moreover,  we  can  predict  that  if  the  target 
is  changed  from  a  ray  to  an  object  that  natural¬ 
ly  has  the  property  of  divisibility,  the  frequen¬ 
cy  of  the  convergence  solution  will  increase 
because  the  first  level  reached  would  be  the 
divisible-object  level.  Gick  and  Holyoak  (1980) 
reported  that  Duncker  found  an  increase  in  the 
frequency  of  the  convergence  solution  when  the 
term  used  was  particles  instead  of  rays.  Con¬ 
trary  to  a  ray,  a  natural  property  of  a  group  of 
particles  is  divisibility,  so  there  is  no  longer  a 
need  to  reach  a  more  abstract  level. 

If the participants are provided with the
army analog as a source (in this story, a fortress
has to be captured by an army without the
army being destroyed by mines), to draw the
correct analogy (the convergence solution), the
army has to be regarded as being composed of
separable parts (soldiers or groups of soldiers)
and the moving of the army must be considered
as the moving of as many parts. A ray cannot
be divided into parts like a solid object with
unconnected parts; it divides by division of the
ray sources. This requires accessing a more
abstract property of division, which includes
both division by separating into parts and division
by dividing the source. For this reason, it
can be predicted that it will be more difficult to
produce the convergence solution. As a matter
of fact, if participants are given a source such
as the military problem, in which the divisibility
of the army is the relevant feature for the
solution, and no hint to use this story, the convergence
solution is seldom produced (Gick &
Holyoak, 1980).

A study by Holyoak, Junn, and Billman
(1984) also provides supporting evidence that
the difficulty of the task increases with the level
of abstraction required. Children had to devise
as many ways as possible of transferring
balls from one bowl to another. A source analog
was a genie who ordered his magic carpet
to roll up into a tube and then used it as a bridge
to transfer jewels from one bottle to another.
Among the materials provided, there was a tube
and a sheet of heavy paper. According to the
authors’ analysis, a key factor is the «rollability»
of the materials. A tube is obviously rollable,
because it is already rolled, but a sheet of
paper is usually used only for writing or drawing,
so considering it as an object which can be
rolled requires regarding it as belonging to the
more general class of rollable objects and
neglecting the property of being usable to write
on. Thus, the mapping of a sheet of paper onto
the magic carpet can be done only at quite an
abstract level (Figure 6) and it can be predicted
that the tube solution will be easier than the
sheet solution. Indeed, most of the children
spontaneously used the tube, even without a
story analog, but only a few participants in the
story group produced the analogous rolled paper
solution, even after a hint.

CONCLUSION 

The main implications of our view that
we wish to underline are the following. (a) As
we consider analogy as a categorization process,
we are able to treat situations in which
the person has a very crude representation of
the target. The source participates in the encoding
of the situation. (b) As our view involves
an abstraction mechanism which makes it
possible to predict how the representation of the
target will evolve, we can treat the issue of
re-representation, that is, how analogy can be
used to deeply change a representation of a
new situation and not only to add a few new
relations to an existing representation. (c) We
provide a formalism in which semantic (the
network is a semantic network), pragmatic
(goal-related aspects are considered as properties
of objects) and structural (the structure
of the network guides the analogy mechanism)
aspects are integrated in the same network.
(d) As the structure of the network constrains
the process, it provides a constraint
system that limits combinatorial explosion. (e)
Semantic aspects are central in our view because
semantic knowledge is used not only
to decide if some objects have to be mapped
but actually to infer knowledge about the new
situation. This fits well with recent results
(e.g., Bassok & Olseth, 1995) showing that
superficial aspects of a situation are used to
infer its structure.

REFERENCES 

Anderson, J. R. (1991). The adaptive nature
of  human  categorization.  Psychological 
Review,  98,  409-429. 


Barsalou,  L.W.  (1982).  Context-independent 
and  context-dependent  information  in 
concepts.  Memory  and  Cognition,  10, 
82-93. 

Barsalou, L.W. (1991). Deriving categories
to achieve goals. In G. H. Bower (Ed.),
The psychology of learning and motivation,
Vol. 27 (pp. 1-64). New York:
Academic Press.

Bassok,  M.,  &  Olseth,  K.L.  (1995).  Object- 
based  representations:  Transfer  be¬ 
tween  cases  of  continuous  and  discrete 
models of change. Journal of Experimental
Psychology: Learning, Memory,
and Cognition, 21, 1522-1538.

Duncker,  K.  (1945).  On  Problem  Solving. 
Psychological  Monographs,  58, 
Whole No. 270.

Gentner, D. (1983). Structure-mapping: a theoretical
framework for analogy. Cognitive
Science, 7, 155-170.

Gentner, D. (1988). Metaphor as structure mapping:
the relational shift. Child Development,
59, 47-59.

Gick,  M.  L.,  &  Holyoak,  K.  J.  (1980).  Analog¬ 
ical  problem  solving.  Cognitive  Psychol¬ 
ogy,  12,  306-355. 

Glucksberg,  S.,  &  Keysar,  B.  (1990).  Under¬ 
standing  metaphorical  comparisons: 
beyond  similarity.  Psychological  Re¬ 
view,  97,  3-18. 

Hofstadter,  D.  (1995).  Fluid  concepts  and  cre¬ 
ative  analogies.  New  York:  Basic  Books. 

Holyoak, K. J., Junn, E. N., & Billman, D. O.
(1984). Development of analogical problem-solving
skill. Child Development,
55, 2042-2055.

Holyoak, K. J., & Thagard, P. (1995). Mental
leaps: Analogy in creative thought.
Cambridge, MA: The MIT Press.

Kokinov, B., & Yoveva, M. (1996). Context effects
on problem solving. In Proceedings of
the 18th Annual Conference of the
Cognitive Science Society. Hillsdale,
NJ: Erlbaum.

Mitchell,  M.  (1993).  Analogy  making  as  per¬ 
ception:  A  computer  model.  Cambridge, 
MA:  MIT  Press. 




Murphy,  G.L.,  &  Medin,  D.L.  (1985).  The  role 
of  theories  in  conceptual  coherence.  Psy¬ 
chological  review,  92,  289-316. 

Morris,  M.W.,  &  Murphy,  G.L.  (1990).  Con¬ 
verging  operations  on  a  basic  level  in 
event taxonomies. Memory & Cognition,
18,  407-418. 

Poitrenaud, S. (1995). The PROCOPE semantic
network: An alternative to action
grammars. International Journal of
Human-Computer Studies, 42, 31-69.

Richard, J-F., & Tijus, C.A. (1998). Modelling
the affordance of objects in problem
solving. Análise Psicológica. Special issue
on cognition and context.

Rosch,  E.  (1978).  Principles  of  categorization. 
In  E.  Rosch  and  B.B.  Lloyd  (Eds.),  Cog¬ 
nition  and  categorization  (pp.  27-48). 
Hillsdale,  NJ:  Erlbaum. 

Ross,  B.H.,  &  Bradshaw,  G.L.  (1994).  Encod¬ 
ing  effects  of  remindings.  Memory  & 
Cognition,  22,  591-605. 

Sander,  E.  (1997).  Analogie  et  Categorisation. 
PhD thesis, University of Paris 8, Saint-
Denis,  France. 

Sander,  E.,  &  Richard,  J-F.  (1997).  Analogical 
transfer  as  guided  by  an  abstraction  pro¬ 
cess:  The  case  of  learning  by  doing  in 
text  editing.  Journal  of  Experimental 
Psychology:  Learning,  Memory,  and 
Cognition,  23,  1459-1483. 

Smith,  E.E.,  &  Medin,  D.L.  (1981).  Categories 
and  concepts.  Cambridge:  Harvard  Uni¬ 
versity  Press. 


Spalding,  T.  &  Murphy,  G.L.  (1996).  Effects 
of  background  knowledge  on  category 
construction.  Journal  of  Experimental 
Psychology: Learning, Memory, and
Cognition, 22, 525-538.

Tanaka,  J.W.,  &  Taylor,  M.  (1991).  Object  cat¬ 
egories  and  expertise:  Is  the  basic  level 
in the eye of the beholder? Cognitive
Psychology,  23,  457-482. 

Turner,  M.  (1988).  Categories  and  analogies. 
In D. H. Helman (Ed.), Analogical Reasoning
(pp. 3-24). Kluwer Academic
Publishers.

Tversky,  B.,  &  Hemenway,  K.  (1983).  Catego¬ 
ries  of  environmental  scenes.  Cognitive 
Psychology,  15,  121-149. 

Vosniadou,  S.  (1989).  Analogical  reasoning  as 
a  mechanism  in  knowledge  acquisition: 
a developmental perspective. In S. Vosniadou
& A. Ortony (Eds.), Similarity
and analogical reasoning (pp. 413-437).
Cambridge:  Cambridge  University 
Press. 

Wharton, C.M., Holyoak, K.J., Downing, P.E.,
Lange,  T.E.,  Wickens,  T.D.,  &  Melz, 
E.R.  (1994).  Below  the  surface:  Analog¬ 
ical  similarity  and  retrieval  competition 
in  reminding.  Cognitive  Psychology,  26, 
64-101. 

Wisniewski,  E.J.  (1995).  Prior  knowledge  and 
functionally  relevant  features  in  concept 
learning. Journal of Experimental Psychology:
Learning, Memory, and Cognition,
21, 449-468.




WITTGENSTEIN AND THE ONTOLOGICAL STATUS OF ANALOGY


Michael  Ramscar 

Department  of  Artificial  Intelligence  University  of  Edinburgh  M.J.A.Ramscar@ed.ac.uk

Ulrike  Hahn 

Department  of  Psychology  University  of  Warwick  U.Hahn@warwick.ac.uk 


ABSTRACT 

Analogy has traditionally been defined
in terms of a contrast definition: analogies
represent connections between things which
are distinct from the ‘normal’ connections determined
by our ‘ordinary’ concepts and categories.
A similar state of affairs holds in the
case of metaphor. In order for definitions
such as this to carry weight, an account of
what constitutes an association between two
things such that they are members of the same
category rather than different ones is needed.
In this paper, we explore the possibility
that categorisation research might not be able
to formulate a story about categories that
yields the kind of unitary theoretical account
that definitions of analogy and metaphor
would seem to require. In particular, we focus
on Wittgenstein’s analysis of concepts
and categories in the Philosophical Investigations
(1953), and the challenges this analysis
presents for contemporary accounts of
categorisation. We then look at how far current
accounts of categorisation can go towards
meeting these challenges, and in the
light of this, we evaluate the kind of ontological
status that analogy (and by extension
metaphor) should be given in studies of cognition.
Should analogy and metaphor be seen
as a separate process, definable in contrast
to categorisation, or should analogy, metaphor
and categorisation instead all be viewed
in a wider context, as manifestations of the
same underlying process?


INTRODUCTION 

The belief that analogy and categorisation are distinct and separable
cognitive processes is widespread: in the pursuit of our everyday lives
we accept without question an ontology that distinguishes between
literality - saying what something ‘really’ is - and analogies and
metaphors, which, however informative they may be, are nevertheless not
considered to make ‘real’ statements about the world. We may talk of
“the foundations of a theory”; we may wish to “buttress a theory with
more facts”; we may accept that “theories we construct can also
collapse”, but from our everyday viewpoints, an igloo and a castle and
a skyscraper appear to share a real relationship that buildings and
theories do not. We can talk of someone’s foxy cunning without really
meaning to imply a direct equation between the cognition of foxes and
humans when it comes to being cunning. French (1995) describes how his
suggestion - to an academic audience - that an upturned orange-crate,
when covered with a cloth and laid out with a picnic, might really be
described as a table met with the uncompromising response, “An orange
crate is an orange crate is an orange crate?” The attachment to
pre-theoretical intuitions is a strong one, even amongst those who seek
to explore and explain them.

Research into categorisation, analogy and metaphor has usually tacitly
accepted this realism. Holyoak and Thagard (1995) describe a world in
which “we think we see things as they really are”, and analogy is used
in order to recycle our existing knowledge of the real world to
formulate new bits of ‘real’ knowledge. In the literature, analogy is
consistently defined in contrast to categorisation (Clement and
Gentner, 1991; Holyoak and Thagard, 1995); for example, Holyoak and
Thagard (1995, p. 217) describe analogy and metaphor as things that
connect “two domains in a way that goes beyond our normal category
structure”.

In order to make a contrast definition stick, one needs an account of
at least one of the contrasting elements. Thus, when an analogy is
defined as an associative judgement between two things that are in
different categories, what is needed is an account of what constitutes
an association between two things such that they are members of the
same category rather than different ones. In this paper, we explore the
possibility that categorisation research might not be able to formulate
a story about categories that yields the kind of unitary theoretical
account that definitions of analogy would seem to require. In
particular, we focus on Wittgenstein’s analysis of concepts and
categories in the Philosophical Investigations (1953; PI), and the
challenges this analysis presents for contemporary accounts of
categorisation. We shall then look at how far current accounts of
categorisation can go towards meeting these challenges, and in the
light of this, we shall evaluate the kind of ontological status that
analogy (and by extension metaphor) should be given in the cognitive
pantheon. We shall argue that, rather than viewing analogy as a
separate process, definable in contrast to categorisation, both analogy
and categorisation might better be seen in a wider context, as
manifestations of the same underlying process.

Wittgenstein  and  categorisation 

Previously (Ramscar, 1997; Ramscar & Hahn, 1998) we have examined in
detail the veracity of the interpretation of Wittgenstein’s view that
is commonly held by researchers studying categorisation, comparing it
with a detailed exposition of Wittgenstein’s arguments. Although
Wittgenstein is often presented as an opaque, difficult to interpret,
and rather obscure philosopher - sometimes leading to the Philosophical
Investigations being seen as a philosophical pick ‘n’ mix, a series of
gnomic quotables to be plundered in support of a thesis - we have
argued that PI sections §66 to §82 actually lay out a clear, if
intricately connected, series of arguments detailing Wittgenstein’s
theoretical treatment of categories and categorisation in a fairly
straightforward manner.

The picture that emerges from a close reading of Wittgenstein’s text is
at considerable variance with the general understanding of
Wittgenstein’s position within cognitive science, a nicely summarised
account of which is presented by Lakoff (1987a; accounts which concur
broadly with this can be found in Johnson-Laird, 1983; Medin & Ortony,
1989; Komatsu, 1992). Lakoff acknowledges Wittgenstein as the first
theorist to notice what he terms a major crack in the classical theory
of concepts and categories (e.g. Katz, 1972). Wittgenstein, claims
Lakoff, argues that categories such as game cannot be accounted for
according to classical theories because there are no properties that
are common to all games. Lakoff draws two key theses from this
argument:

1: “Games, like family members are similar to one another in a variety
of ways”; and

2: “That [family resemblances], and not a single well defined
collection of common properties is what makes game a category”
(Lakoff, 1987a, pp. 16-17)

Whilst 1 is an uncontentious statement of Wittgenstein’s views, 2 is a
rather more difficult interpretation to sustain. In PI §66 (p 31)
Wittgenstein explicitly states that ‘you will not see something that is
common to all [games]’. Rather, he argues that what games have in
common is the now notorious family resemblances: ‘a complicated network
of similarities overlapping and criss-crossing: sometimes overall
similarities, sometimes similarities of detail’ (PI, p 32). Lakoff (and
cognitive scientists in general) take this to be Wittgenstein’s
characterisation of what a category is. What seems to escape previous
interpreters is the extreme negativity of this characterisation. In PI
§67 (pp 31-2) Wittgenstein explicitly condemns this characterisation of
naming categories as vacuous. Saying that the common theme that runs
through a category is the continual overlap of family resemblances is
directly analogous to saying that the common thing that runs through a
thread is continuous overlapping of the fibres that make up the thread,
and Wittgenstein dismisses both of these accounts as empty gestures:
‘Now you are only playing with words’ (PI p 32). There is, he says, no
thing that runs through a thread in the form of overlapping fibres; a
thread simply is a series of overlapping fibres. His view is a serious
challenge to, rather than an endorsement of, Lakoff’s formulation: if
family resemblances are the common thing that runs through game, just
as overlapping fibres are the common thing that runs through a thread,
then what is this thing supposed to be? How is it supposed to do
whatever it is it is supposed to do? How long, Wittgenstein asks, is a
piece of string?
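The structural point at issue, a network of overlapping similarities with no property common to all members, can be made concrete with a minimal sketch. The games and the feature labels below are hypothetical, chosen purely for illustration; nothing here is taken from Wittgenstein or Lakoff, and the sketch does not settle whether such overlap explains category membership, which is exactly what the argument above calls into question.

```python
from itertools import combinations

# Hypothetical feature sets, chosen only for illustration.
games = {
    "chess": {"board", "competition", "skill"},
    "poker": {"cards", "competition", "luck"},
    "patience": {"cards", "solitary"},
    "wall_ball": {"ball", "solitary", "amusement"},
    "ring_a_roses": {"amusement", "group"},
}

# Features shared by every game: the candidate "common thing".
common_to_all = set.intersection(*games.values())

# Non-empty pairwise overlaps: the criss-crossing similarities.
overlaps = {
    (a, b): games[a] & games[b]
    for a, b in combinations(games, 2)
    if games[a] & games[b]
}


def linked_to(start):
    """Return every game reachable from `start` through chains of overlap."""
    reached, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for a, b in overlaps:
            if current in (a, b):
                other = b if current == a else a
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return reached


print("common to all:", common_to_all)
print("overlapping pairs:", sorted(overlaps))
print("all linked from chess:", linked_to("chess") == set(games))
```

In this toy data every game is linked to every other through a chain of overlaps, yet the set of features common to all is empty, and the two ends of the chain share nothing directly. The sketch shows only that the fibres overlap; it identifies no further "thing" that runs through them.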

Naming and boundaries - the length of a string

The question of ‘how long is a piece of string?’ becomes important once
the second part of Lakoff’s exposition is introduced. Wittgenstein, as
Lakoff notes, argues that the boundaries of categories are not fixed,
commenting:

68. “All right: the concept of number is defined for you as the logical
sum of these individual interrelated concepts: cardinal numbers,
rational numbers, real numbers, etc.; and in the same way the concept
of a game is the logical sum of a corresponding set of sub-concepts.” -
It need not be so. For I can give the concept ‘number’ rigid limits in
this way, that is use the word “number” for a rigidly limited concept,
but I can also use it so that the extension of the concept is not
closed by a frontier... (Wittgenstein 1953, pp 32-3).

Lakoff interprets this discussion of number as follows: historically,
numbers were first taken to be integers, and then ‘numbers’ were
successively extended to include rational numbers, real numbers,
complex numbers, transfinite numbers, and all of the other numbers that
mathematicians are wont to invent. But the concept of ‘number’ is not
bounded in any natural way, and it can be limited or extended depending
upon one’s circumstances and purposes. Lakoff says that in mathematics,
intuitive human concepts like number must receive precise definitions:
Wittgenstein’s point, he claims, is that different mathematicians give
different definitions, depending upon their goal. Thus although the
category number can be given precise boundaries in many ways, ‘the
intuitive concept is not limited in any of those ways; rather, it is
open to both limitations and extensions’ (Lakoff, 1987a, p. 17).

The key question, on Lakoff’s account, is how those limitations and
extensions are governed - what factors determine the boundaries of
categories in given circumstances. Lakoff answers this question in
relation to game by saying that game’s boundaries are governed by
resemblance to previous games in appropriate ways: a new thing can be a
game if it is suitably similar to previous games. Lakoff cites the
introduction of video games in the 1970s as a recent example of the
boundaries of the game category being extended on a large scale.

Again, discrepancies can be distinguished between Lakoff’s
characterisation of Wittgenstein’s views and the content of
Wittgenstein’s stated arguments. In §68, Wittgenstein says that one
‘can give the concept ‘number’ rigid limits in this way, that is use
the word “number” for a rigidly limited concept,’ - Lakoff’s claim that
in mathematics number must receive precise definitions appeals to this
- ‘but I can also use it so that the extension of the concept is not
closed by a frontier.’ Here, Wittgenstein is not talking about the
extensibility of borders, but something far more radical: ‘You can draw
[a boundary], for none has so far been drawn. (But that never troubled
you when you used the word “game” before)’ (PI pp 32-3). Wittgenstein
isn’t talking here about the extensibility of boundaries; he is talking
about their absence, a point developed in PI §69 to §73: categories do
not have, or need, boundaries at all. In the context of Wittgenstein’s
overall discussion of categories, this is a vitally important point: it
is one thing to seek to determine the length of a piece of string whose
length is not fixed (we might add a temporal dimension to our answer
for instance); it is quite another thing to seek to find out how long a
piece of string is when the string is of no particular length at all.

Here, Wittgenstein is emphatic (PI §69): one can draw a boundary, for a
special purpose, but it is just that, a drawn boundary. Important in
the context of the special purpose, no doubt, but arbitrary to the
concept or category in question. We do not need to draw boundaries,
because we can happily use concepts where no boundary has been drawn;
thus categories do not need boundaries to be usable. To reiterate this
point, Wittgenstein considers the state of a user of a category
(concept) who cannot specify that category’s boundaries: is the user
ignorant of those boundaries? - No, she does not ‘know the boundaries
because none have been drawn’ (PI, p 33). Not knowing the boundaries of
game is not a state of ignorance - it is just reflective of the
boundaryless state of the category game.

71. One might say that the concept ‘game’ is a concept with blurred
edges. - “But is a blurred concept a concept at all?” - Is an
indistinct photograph a picture of a person at all? Is it even always
an advantage to replace an indistinct picture by a sharp one? Isn’t the
indistinct one often exactly what we need?

Frege compares a concept to an area and says that an area without
boundaries cannot be called an area at all. This presumably means that
we cannot do anything with it. - But is it senseless to say: "Stand
roughly there"? Suppose that I were standing with someone in a city
square and said that. As I say it I do not draw any kind of boundary,
but perhaps point with my hand - as if I were indicating a particular
spot. And this is just how one might explain to someone what a game is.
One gives examples and intends them to be taken in a particular way. -
I do not, however, mean by this he is supposed to see in those examples
that common thing that I - for some reason - was unable to express; but
that he is now going to employ those examples in a particular way.
Here, giving examples is not an indirect means of explaining - in
default of a better. For any general definition can be misunderstood
too. The point is that this is how we play the game. (I mean the
language game with the word "game".) (Wittgenstein 1953, p 34).

Wittgenstein’s rejection of boundaries - and not just the idea of
fixing upon this boundary rather than that one - seems to be both clear
and unambiguous. We don’t have to define boundaries in order to use
concepts, nor is it clear that definite boundaries are always what we
need; these points can be further drawn out if we contemplate §71 in
conjunction with §76:

76. If someone were to draw a sharp boundary I could not acknowledge it
as the one that I too always wanted to draw, or had drawn in my mind.
For I did not want to draw one at all. His concept can be said to be
not the same as mine, but akin to it. The kinship is that of two
pictures, one of which consists of colour patches with vague contours,
and the other of patches similarly shaped and distributed, but with
clear contours. The kinship is just as undeniable as the difference.
(Wittgenstein 1953, p 36).

Categories do not have boundaries, and by defining boundaries we do not
capture these categories, we create something new - call them bounded
categories (in §68, Wittgenstein calls them ‘rigidly limited’ concepts,
so we might call a bounded game a rigidly limited game) - which have
some kind of kinship with our natural naming categories (e.g. game),
but a rigidly limited game is markedly and importantly different to
game.

To return to family relations, these are the fibres that make up the
threads that are categories: but Wittgenstein explicitly states that
the length of these threads cannot be determined.

Wittgenstein argues that in explaining what game is, one gives examples
of instances of game, and one intends those examples to be taken in a
particular way. What one does not do is expect the person to whom one
is explaining ‘game’ to see the common thing - whether it be a core,
schema or essence - which one cannot actually see oneself. It is true,
says Wittgenstein, that when we give these examples our subject might
see kinships between the examples, but these kinships are not in any
way essential. Giving these examples, says Wittgenstein, is not an
indirect explanation; it is the explanation. We don’t give a general
definition, not because we can’t think of one, but because there is
none to give.

72. Seeing what is common. Suppose I show someone various
multi-coloured pictures, and say: "The colour you see in all these is
called ‘yellow ochre’". - This is a definition, and the other will get
to understand it by looking for and seeing what is common to the
pictures. Then he can look at, and point to, the common thing.

Compare this with a case where I show him figures of different shapes
all painted the same colour, and say: "What these have in common is
called ‘yellow ochre’".

And compare this case: I show him samples of different shades of blue
and say: "The colour that is common to all these is what I call
‘blue’". (Wittgenstein 1953, p 34).

It is not just that there is no single ‘thing’ common to all:
Wittgenstein questions the way that ‘commonalities’ are supposed to be
garnered in the first place. In the first example in §72 above, the
commonality is easy to spot: provided the only common colour in the
pictures was yellow ochre, and provided that the subject had grasped
the meaning of colour, then she will be able to grasp what yellow ochre
is - the colour that is common in all the pictures.

In example two, the subject could not proceed in the same way: although
the figures all have colour (yellow ochre) in common, they also have
other commonalities, such as being figures. Thus the subject could as
easily learn to apply ‘yellow ochre’ to yellow ochre or to figures, or
even to samples (all of the samples are ‘samples’ after all) from this
example. Nothing in the definition picks out the particular commonality
that ‘yellow ochre’ is supposed to pick out.¹

Finally, in example three, there is no a priori colour commonality to
the pictures; rather, the commonality can only be perceived if one
already has the concept ‘blue’ (otherwise, one would see a riot of
various ‘colours’); since understanding this example is dependent upon
an understanding of ‘blue’, the example could not serve as an
explanation of, or a definition of, ‘blue’.

Wittgenstein raises a number of questions that the introduction of the
idea of a generalised schema to serve as the basis for a category
poses. Firstly, there is the question of the form that a generalisation
should take: i.e. what shape should a generalised leaf be? Linked to
this is the question of the use of schemas. Even when we can answer the
first question - how we, say, generate a generalised temperature for
ice-cream - we are still left with the related question of how such a
generalisation is to be used. Which particular aspects of the schema
are general, and which are not (we might rephrase this question as
asking which parts of the schema represent ‘the generalised concept’,
and which are implementational details of the representation of this
generalisation), and how in use are we supposed to know which is which?
Is the generalised green shape a schema for green or a schema for
generalised shape? This raises the further question: provided one could
generate answers to these very challenging questions, what is supposed
to be intrinsic to such a schema that would cause it to be used
differently to an example of that which it was supposed to be a
generalisation of? In the Philosophical Investigations Wittgenstein
makes quite clear his belief that no satisfactory answers to these
questions can be provided. Thus he does not advocate schemas as a
theory of category representation (as argued by Johnson-Laird, 1983),
but rather he seeks to demonstrate that schemas alone cannot provide an
account of how concepts are represented.


¹ Quine (1960) makes a similar point in his famous gavagai discussion.

Wittgenstein and ‘family resemblances’

We can state the broad outline of Wittgenstein’s arguments as follows:

1. That categories have no necessary or sufficient defining
characteristics: rather that kinships (“family resemblances”) can be
traced across categories (§65-7)

2. That these category spaces are unbounded - i.e. there are no
boundaries to the space across which “family resemblances” can be
traced (§68, 69, 70, 71, 73)

3. That learning a category such as game does not involve extracting an
essence or schema from instances (§71-83). Rather, this process
involves learning examples (instances) and appropriate ways of using
these examples (§69, 71, 73, 81, 82)

These arguments, as examined so far, do not advocate a particular view
of concepts and categories - what has become known loosely as ‘family
resemblance theory’ - but rather they represent a thorough attempt to
elucidate the deep problems inherent in trying to account for concepts
and categorisation. To Wittgenstein, the problems involved in
explaining how categories are defined stem not from the phenomenon
under examination, but the way this phenomenon has traditionally been
defined (hence, perhaps, the famous ‘don’t think, but look!’). If we
‘think’ - i.e. if we assume that the existence of things called games
entails the existence of, say, a central schema (defined in some as yet
to be determined way) in virtue of which the things can be considered
games - we do not explore categorisation: we merely predetermine the
explanations we can formulate.

Empirical  support 

Each of the main claims Wittgenstein makes is, we think, amply
supported in the categorisation literature (for a full review of this,
see Ramscar and Hahn, forthcoming).

1. Necessary and sufficient conditions. Wittgenstein’s first argument
attacks the definitional or “Classical” view of concepts (Smith and
Medin, 1981): this holds that concepts possess definitions specifying
features necessary and sufficient for the concept. This definition is
the summary description of the entire class used in every instance of
categorisation, which proceeds simply by checking for the presence of
these features in the entity in question. This view is commonly
supplemented by the “nesting assumption” that a subordinate concept
(e.g., robin) contains nested within it the defining features of the
super-ordinate (bird).
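The procedure the definitional view assumes can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `ClassicalConcept`, its methods, and the feature labels for bird and robin are all hypothetical, invented here to show feature checking and the nesting assumption, and are not drawn from Smith and Medin (1981).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ClassicalConcept:
    """A concept defined by features taken to be necessary and sufficient."""

    name: str
    own_features: FrozenSet[str]
    superordinate: Optional["ClassicalConcept"] = None

    def defining_features(self) -> FrozenSet[str]:
        # Nesting assumption: a subordinate concept contains the
        # defining features of its super-ordinate.
        inherited = (
            self.superordinate.defining_features()
            if self.superordinate is not None
            else frozenset()
        )
        return inherited | self.own_features

    def includes(self, entity_features) -> bool:
        # Categorisation is a check that every defining feature is present.
        return self.defining_features() <= frozenset(entity_features)


# Hypothetical feature labels, for illustration only.
bird = ClassicalConcept("bird", frozenset({"has_feathers", "lays_eggs"}))
robin = ClassicalConcept("robin", frozenset({"red_breast"}), superordinate=bird)

entity = {"has_feathers", "lays_eggs", "red_breast", "sings"}
print(robin.includes(entity), bird.includes(entity))
print(bird.includes({"has_feathers"}))
```

On this picture membership is all-or-none: an entity either has every defining feature or it does not, so the category boundary is sharp by construction. The sketch makes plain what would have to be supplied for an everyday concept, namely an actual list of defining features.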

However, the definitional view seems inadequate as a theory when
transferred from artificial concepts in controlled experiments to our
everyday concepts (i.e. the concepts for which we typically have
words). Of the difficulties faced here, the most serious one is that
almost all everyday concepts appear to be indefinable (Fodor, 1981). It
simply does not seem possible to formulate necessary and sufficient
conditions for being, for example, a chair, or a window, or a smile, as
illustrated by the fact that dictionary “definitions” of almost all
terms are not really definitions at all. They do not provide necessary
and sufficient conditions for category membership - instead they
typically do no more than provide some relevant information about
category members, which may help the dictionary user identify which
concept is intended. Further evidence against the definitional view
comes from examining the boundaries of natural language categories. The
definitional view implies that these are sharp, cleanly separating
instances from non-instances. But, as Wittgenstein claimed, this turns
out not to be the case.

2. Boundaries. In 1949, Black provided the following thought experiment
to illustrate that category boundaries might be vague: one is to
imagine a series of ‘chairs’ differing in quality by least noticeable
amounts. This can give rise to an ordered sequence which moves from a
Chippendale chair on the one end to a small nondescript lump of wood at
the other end. A ‘normal’ observer, argues Black, should find it
extremely difficult to point to the dividing line between ‘chairs’ and
‘non-chairs’ along this continuum, which illustrates a different source
for category vagueness. (The difficulties posed by continua were
already recognized in the sorites (heap) and phalakros (bald man)
paradoxes, which originate with the Megarian philosophers in the early
4th century BC; Barnes, 1979.) Black makes it clear that this
uncertainty over category boundaries can be generated for any term
whose application requires the use of a sense, that is to say all
‘material’ terms.

Quine (1960) points out that indeterminacy can arise not only because
the category boundary is vague (a phenomenon generally referred to as
‘fuzziness’) but also because the boundaries of an entity can be vague.
To illustrate with his example of ‘mountain’, which is “vague on the
score of how much terrain to reckon into each of the indisputable
mountains, and it is vague on the score of what lesser eminences count
as mountains” (Quine, 1960, p. 126).

A third source of uncertainty over boundaries has been identified by
Lakoff (1987b). Even when concepts do appear to have definitions, these
definitions generally hold only with respect to a range of ‘background
assumptions’. Varying these assumptions immediately produces unclear or
borderline cases:

“The noun bachelor can be defined as an unmarried adult man, but the
noun clearly exists as a motivated device for categorizing people only
in the context of a human society in which certain expectations about
marriage and marriageable age obtain. Male participants in long-term
unmarried couplings would not ordinarily be described as bachelors; a
boy abandoned in the jungle and grown to maturity away from contact
with human society would not be called a bachelor.” (Fillmore, quoted
in Lakoff, 1987b)

Background factors, such as the social conventions concerning marriage,
will, in general, hold to varying degrees. Presumably the definition of
bachelor can meaningfully be applied if the background conditions are
sufficiently similar to the conventions concerning marriage current in
the West.

Alongside such arguments, direct empirical evidence that (many) natural
language categories do not have clear boundaries was accumulated in the
1970s. The first studies we know of were conducted by the linguist
William Labov, summarised in Labov (1973). His empirical work focuses
on cup-like containers, examining the variability inherent in the use
of terms such as cup, bowl, mug etc., between subjects and between
contexts. Labov’s interest was primarily in formalising the variability
found, thus his results are not presented with the detail experimental
psychologists might want. This gap is readily filled by McCloskey and
Glucksberg (1978) who presented a study of 540 exemplar-category pairs
(e.g., apple-fruit) which revealed not only substantial between- and
within-subject disagreement over category membership (the latter
measured over successive test-sessions) but also showed levels of
disagreement to correlate with independently derived typicality
ratings.

3. Essences versus examples. Wittgenstein’s final point
rejects the idea of some abstracted schema in preference
for an account based on previously encountered examples.
As Komatsu (1992) notes, the vast majority of experimental
results do not directly indicate anything about conceptual
representation: separating form, content and the processes
acting on concepts is an invidious business (best
illustrated by Wittgenstein’s remarks on schemas above).
Nevertheless, the issue of whether or not a particular
learning process involves the abstraction of some core
essence - be it a schema, a theory or a prototype - has
been central to experimental psychology in the last decades
and has been pursued not only in concept learning tasks,
but also in related domains such as Artificial Grammar
Learning (Shanks and St John, 1994). Controversy has raged
not only over actual empirical evidence for or against
abstraction, but also about the very criteria on which a
distinction could be conceptually and empirically based.*

* Barsalou (1990) has argued that exemplar storage and
abstraction in category representation are impossible to
distinguish in principle. This position, based on a highly
idiosyncratic notion of abstraction, is overly pessimistic.
Careful evaluation of the many criteria that have been put
forth, particularly in order to distinguish between
processes based on rules and processes based on exemplar
similarity, reveals that many have been overestimated in
their power to cleanly distinguish between the two (Hahn
and Chater, 1998).

From an experimental perspective, Hahn and Chater (1998)
argue, a compelling way to address this issue is through
model comparisons of fully specified cognitive models.
Take, for example, the evidence regarding prototypes:
evidence for prototypes in natural language categories has
been sought from a variety of sources. Classic are those
studies which identified a variety of so-called “prototype
effects”; all of these involve some form of differential
reaction to central or typical members of a category, such
as differences in typicality ratings, faster reaction times
in speeded classification tasks or differential retention
in memory relative to other items (see e.g. Rosch, Simpson
and Miller, 1976; Posner and Keele, 1968; Posner and Keele,
1970). However, such effects do not unequivocally indicate
mental representations of concepts in terms of prototypes
(Lakoff, 1987b). Rather, such effects might arise from
cognitive representations and processes which make no use
of representations of prototypes or central tendencies as
such.

This is made clear by comparative model fitting of fully
specified process models (though this tends to come at the
expense of artificial stimulus domains). The categorisation
literature has accumulated a wealth of studies in which
model comparisons between exemplar models, which simply
store all encountered instances in memory, and prototype
models, which abstract a central tendency, have
consistently gone in favour of exemplar models: exemplar
models have yielded quantitative fits superior to the
prototype models tested and accounted for a wide range of
phenomena traditionally associated with prototypes, such as
the instability of instance retrieval and typicality
judgements; the levels of specificity at which concepts are
encoded; sensitivity to correlations amongst category
instances; and the way accuracy in classification tasks
increases with category size (Nosofsky, 1986, 1987, 1988b,
1989, 1991b; Nosofsky, Clark and Chin, 1989; Shin and
Nosofsky, 1992; Lamberts, 1996). Moreover, as Komatsu
(1992) notes, if one assumes that individuals only retrieve
a subset of these stored exemplars on any given occasion,
but are inclined to regard that subset as exhaustive
(Nickerson, 1981), then an exemplar-based approach may also
be able to begin to explain why it is that people believe
that categories have essences and boundaries.*
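
The contrast between the two model classes can be made concrete
with a minimal sketch. It assumes a two-dimensional stimulus
space, city-block distance and an exponentially decaying
similarity function in the general style of exemplar models such
as Nosofsky's; the stimuli, the sensitivity parameter c and all
function names are invented here for illustration and are not
taken from any of the studies cited.

```python
import math

def similarity(x, y, c=1.0):
    """Similarity decays exponentially with city-block distance."""
    distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-c * distance)

def exemplar_evidence(item, exemplars, c=1.0):
    """Exemplar model: summed similarity to every stored instance."""
    return sum(similarity(item, e, c) for e in exemplars)

def prototype_evidence(item, exemplars, c=1.0):
    """Prototype model: similarity to the abstracted central tendency only."""
    n = len(exemplars)
    prototype = tuple(sum(values) / n for values in zip(*exemplars))
    return similarity(item, prototype, c)

def choice_probabilities(item, categories, evidence, c=1.0):
    """Luce choice rule: evidence per category, normalised to sum to 1."""
    scores = {name: evidence(item, members, c)
              for name, members in categories.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Hypothetical stimuli: category A has four clustered members and one outlier.
categories = {
    "A": [(1, 1), (1, 2), (2, 1), (2, 2), (4, 4)],
    "B": [(6, 6), (6, 7), (7, 6), (7, 7)],
}

typical = exemplar_evidence((2, 2), categories["A"])
atypical = exemplar_evidence((4, 4), categories["A"])
print(typical > atypical)  # graded typicality without any stored prototype
```

In this toy setting the exemplar model yields a graded
"prototype effect" (the clustered member receives more evidence
than the outlier) although no central tendency is ever computed
or stored, which is the sense in which such effects do not by
themselves indicate prototype representations.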

Similarly, those few empirical studies that have directly
addressed the assumptions behind core essences - whether as
schemas or theories - have found little or no support for
the idea that essences are extracted in category learning.
Malt (1994) found that the assumption (Putnam, 1975) that
H2O is the essence of water did not stand up to empirical
scrutiny, and that judgements of the amount of H2O in a
liquid were very poor predictors of whether it was water or
not. In another study, Ramscar, Darrington, Pain and Lee
(1998) used differences in the recall characteristics of
surface and structural aspects of representations (Gentner,
Ratterman and Forbus, 1993) to show that subjects could
classify items together under a category name, and carry
out recall tasks with category members grouped by that
name, without extracting a category schema or essence;
Ramscar et al.’s subjects appeared to have stored only
exemplars in their category encoding.

In summary, at present at least, there is no clear evidence
in the literature for abstraction in concept acquisition,
whilst there is considerable evidence which can be
marshalled in support of some kind of exemplar-based
account.

WHITHER  TWO  PROCESSES? 

Like the empirical findings we present above,
Wittgenstein’s arguments bear down on any all-encompassing
view of category structure. Together, the two appear to
effectively explode the idea of the category as a unitary
theoretical instrument: how likely is it that, even if
categories are not held together by defining features,
shared essences or some other common thread running through
them, there is a fundamental unity in all categories? That
clear-cut members all have higher within-category
similarity than between-category similarity, or that all
are based on partial theories, and so on?

We argued earlier that in order for the standard contrast
definition of analogy to do its work, an account of
categorisation, distinct from that contrast definition, was
necessary. As this brief survey shows, no such account is
available, nor does it seem likely that any answers to
Wittgenstein’s deep questions regarding any
‘straightforward’ account of categorisation will be
forthcoming.

* One other contender in current debate about conceptual
structure is the so-called theory-based view (Murphy &
Medin, 1985; Medin & Ortony, 1989). The theory-based view
is defined primarily in contrast to any account, prototype-
or exemplar-based, which seeks to ground real world
categories in terms of perceptual similarity. It emphasises
the role of background knowledge or “theories” in our
everyday classification, in order to explain, for instance,
the fact that, despite strong perceptual similarities, we
do not classify bats as birds. Due to its lack of
explicitness the theory-based view is not that easy to
align with Wittgenstein’s claims. Given the problems
inherent in definitional accounts of conceptual structure
(see above), one must assume that “theories” are not
complete in the sense of allowing deduction of
classification decisions, but are only “partial”, in that
they form one component of a complex, non-deductive overall
process (Hahn & Chater, 1997). This overall process is not
generally spelled out by advocates of the theory-based
view. The simple claim, then, that “partial theories” or
background knowledge are relevant to categorisation need
not conflict with Wittgenstein’s arguments. There is no
statement about boundedness, nor is there a claim of
definitional features. Though the theory-based view does
suggest that learning and understanding a category also
involves acquiring appropriate background knowledge, this
does not directly contradict the role of examples in
acquisition and use, but merely suggests an additional
factor. This still leaves a problem regarding partial
theories, i.e. how partial does a theory have to be to not
be stating an “essence”? Given that the theory-based view
has done little to provide full accounts of any categories,
no definite answer can be given to this question here. To
the extent, though, that too much faith is invested in the
power of theories, another look at Wittgenstein’s arguments
and examples might be sobering.

Furthermore, we have carried out a number of studies which
directly explore the contrast definition from the opposite
direction, examining the properties typically used to
separate analogy from categorisation. Ramscar and Pain
(1996) showed that subjects would categorise Gentner,
Ratterman and Forbus’s (1993) classic analogy materials
using exactly the same process that they used to determine
analogies between them. Darrington, Lingstadt and Ramscar
(1998) showed that the same process - structure mapping,
typically considered the preserve of analogy - could cause
subjects to override supposedly ecological categories in
sorting tasks, with participants preferring groupings
between pots and walls, and walls and pans, to pots and
pans and walls alone. These studies can be added to other
theoretical and empirical evidence against a two-process
account of literal (categorical) versus non-literal
(analogical or metaphorical) reasoning, such as Hoffman and
Kemper’s (1987) review of a number of reaction time
studies, which also demonstrates the paucity of the
evidence for the widely held belief that literal
(intra-categorical) meanings are processed faster than
metaphorical (inter-categorical) meanings (as well as the
considerable evidence for the opposite effect; see also
Récanati, 1995; Glucksberg and Keysar, 1990; Gibbs, 1984).
Theoretically, at least, distinguishing analogy from
categorisation may not be the simple task our intuitions -
and the literature - might have us believe.

One defence, in the light of these arguments, might be an
appeal to categories grounded in ecology: the difference
between analogy and categorisation is that categories
really do - in some way - reflect the underlying structure
of the world in a way that analogies do not. Whilst
researchers in mainstream categorisation research are often
at pains to disavow metaphysical realism (c.f. Murphy,
1996), in practice the very kinds of categories they choose
to examine, and the attitude they adopt towards them in
discussing between-category comparisons, temper the impact
of these protests.

In disagreeing with Wittgenstein’s position regarding
categorisation, Medin and Ortony (1989) suggest that if
people really think about the fact that whales are mammals
not fish, they will see that with respect to some
important, although less accessible, property or properties
whales are similar to other mammals. “If one cannot appeal
to hidden properties, it is difficult to explain the fact
that people might recognise such similarities... there
might be a price to pay for looking rather than thinking.”
(Medin and Ortony, 1989, pp. 179-180). The problem is that
it is just this fact that is the point of any investigation
of human categorisation (c.f. Malt, 1994). In ‘Cetology’
(Moby Dick, Melville, 1851), the central character,
Ishmael, examines all of the reasons put forward by
Linnaeus for classifying whales as mammals.

I submitted all [these] to my friends Simeon Macey and
Charlie Coffin, of Nantucket, both messmates of mine in a
certain voyage, and they united in the opinion that the
reasons set forth were altogether insufficient. Charlie
profanely hinted that they were humbug.

Be it known that, waiving all argument, I take the good
old fashioned ground that the whale is a fish, and call
upon holy Jonah to back me.

As Wittgenstein famously remarked, our talk of processes
and states is just what commits us to a particular way of
looking at a matter (Wittgenstein, 1953, p. 102). Choosing
what is to count as facts when it comes to categorisation
is a powerful determinant of the picture of the process one
will uncover. And taking on board a different set of facts
can radically alter any such picture. All classification
systems are human constructs, and our immersion in one such
system shouldn’t blind us to alternatives. Similarly, it is
important to be aware of the social dimensions of
categorisation, and the way collective and individual
categories can differ; it may be - in the study of the
cognitive processes of categorisation - that individual
facts might reveal more than collective ones.

If we broaden our view, we see that ecologically, the
distinction between categorisation and analogy is a recent
one: the conceptual revolution begun by Linnaeus represents
the overthrow by a system based on heredity of a previous
system based far more on analogy. As Thomas (1984) argues
in his detailed account of changes in natural kind
categories in England in the period 1500-1800, for much of
the early modern period, ‘the universal belief in analogy’
resulted in much of the natural world being categorised and
understood by analogy with human social structures. Bees
had Princes, Potentates, Kingdoms and Dominions (Warder,
1716; Rusden, 1679, quoted in Thomas, 1984, p. 62); they
were ruled over by ‘a fair and stately bee, having a
majestic gait and aspect’ (Levett, 1634, quoted in Thomas,
1984, p. 62). Cranes followed a captain; Rooks had a
parliament; Storks and Ants and Beavers were avowed
republicans. As Thomas notes, this picture of the natural
world fed back recursively into concepts of human society:
King Henry VII once ordered the execution of all mastiffs,
after they had baited a lion, ‘being deeply displeased ...
that an ill-favoured rascal cur should with such violent
villainy assault the valiant lion, king of all beasts’
(Caius, 1576, quoted in Thomas, 1984, p. 60).

The important issue here is not whether the Linnaean way of
construing the world is right, or whether other
‘pre-Linnaean’ conceptual schemes are wrong; nor is it a
question of finding an analysis that will answer these
questions. All that different conceptual schemes such as
these reflect is the differing attitudes to pre-theoretical
ideas of categorisation and analogy that they embody (and,
as Lakoff, 1987a, illustrates, the Linnaean revolution may
be less complete than we generally believe). Our claim is
that if we wish to explain the cognitive processes that
actually underpin analogy and categorisation, then it is
just these pre-theoretical intuitions we should question,
and, for certain purposes, abandon.

The consequence of our investigation, of both
Wittgenstein’s position and the supporting evidence, is a
claim analogous to that which has been made for the related
processes that determine literal and metaphoric meaning.
Gibbs (1984) notes that the claim that there is no
principled distinction between literal and metaphoric
meaning leaves one important question unanswered: how can
we explain why people can often judge a sentence to be
literal or metaphoric? What lies behind the intuition that
“an orange crate is an orange crate is an orange crate”?
Whilst Gibbs acknowledges that this intuition needs
exploring, he asks “does it indicate that listeners process
[our emphasis] so called literal and metaphoric utterances
differently?” (p. 296). Rumelhart (1979) makes the point
that “the classification of an utterance as to whether it
involves literal or metaphoric meanings is analogous to our
judgement as to whether a bit of language is formal or
informal. It is a judgement that can be reliably made, but
not one which signals fundamentally different comprehension
processes” (p. 79). Gibbs argues that one reason why some
sentences seem so literal is that listeners are influenced
by the interpretative context in which such judgements are
made: people judge a sentence as having literal meaning
because it is isomorphic with the situation in which the
sentence is interpreted (Fish, 1980). However, it doesn’t
follow from this that the literal meanings of sentences can
be uniquely determined, as our understandings of situations
always influence our understandings of sentences. Says
Gibbs, “To speak of a sentence’s literal meaning is to
already have read it in the light of some purpose, to have
engaged in an interpretation. What often appears to be the
literal meaning of a sentence is just an occasion-specific
meaning where context is so widely shared that there
doesn’t seem to be a context at all” (Gibbs, 1984, p. 296).
As for judging whether sentences are literal, we claim, so
for judging whether whales are mammals or fish; or, for
that matter, whether our picnic is ‘on the orange-crate’ or
‘on the table’.

It may be that the best accounts of categorisation will
also incorporate an account of analogy, and explain both in
terms of a single cognitive process. Some of the more
important findings from existing analogy research concern
the role that representational structure has to play in
similarity judgements, and the differing roles that surface
and structural features play in recall. It may be that
incorporating a dimension of structural similarity into the
similarity space mapped in an exemplar model of
categorisation might also enable the modelling of analogy
and superficial similarity, without recourse to multiple
processes. On such a model, strong similarity across all
dimensions (including both surface and structural
similarities) might betoken strong categorical similarity -
with, perhaps, the strongest similarities occurring in
basic level categories - whereas strong mappings on only a
subset of similarity dimensions would underpin analogical
(or superficial, or metaphorical) similarity.
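
A rough sketch of how such a single-process account might be
set out is given below. Everything in it is an assumption made
for illustration: items are reduced to two hand-coded feature
sets (surface and structure), similarity on each dimension is
plain set overlap, and a single threshold separates "strong"
from "weak" similarity. It is not a model proposed or tested in
the work discussed here.

```python
def overlap(a, b):
    """Jaccard overlap between two feature sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def similarity_profile(x, y):
    """One similarity value per dimension of the similarity space."""
    return {
        "surface": overlap(x["surface"], y["surface"]),
        "structure": overlap(x["structure"], y["structure"]),
    }

def kind_of_similarity(x, y, threshold=0.5):
    """Name the pattern of strong and weak dimensions for a pair of items."""
    profile = similarity_profile(x, y)
    surface = profile["surface"] >= threshold
    structure = profile["structure"] >= threshold
    if surface and structure:
        return "categorical"   # strong on all dimensions
    if structure:
        return "analogical"    # strong on structure only
    if surface:
        return "superficial"   # strong on surface only
    return "dissimilar"

# Hypothetical hand-coded items.
kitchen_table = {"surface": {"wooden", "four_legs", "flat_top"},
                 "structure": {"supports(meal)", "sit_around(people)"}}
dining_table = {"surface": {"wooden", "four_legs", "flat_top"},
                "structure": {"supports(meal)", "sit_around(people)"}}
picnic_crate = {"surface": {"wooden", "slatted", "box_shaped"},
                "structure": {"supports(meal)", "sit_around(people)"}}
sculpture = {"surface": {"wooden", "four_legs", "flat_top"},
             "structure": {"displayed_in(gallery)"}}

for other in (dining_table, picnic_crate, sculpture):
    print(kind_of_similarity(kitchen_table, other))
```

The same comparison routine is applied to every pair; only the
pattern of strong and weak dimensions differs between the
categorical, analogical and superficial cases.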


This would still leave us with the problem of explaining
people’s intuitions about analogy and categorisation.
However, as we noted earlier, if one assumes that
individuals only retrieve a subset of stored exemplars
during any given similarity computation episode, and that
they may be inclined to regard that subset as exhaustive
(mimicking Gibbs’s, 1994, point made earlier: all
judgements are contextual, even if it doesn’t feel like
they are; the subset of exemplars recalled simply matches
the context of the categorisation judgement to be made),
then an exemplar-based approach might be able to begin to
explain why it is that people believe that categories have
essences and boundaries. To return to French’s (1995)
suggestion that an orange-crate, when covered with a cloth
and laid out with a picnic, might really be a table: a
model such as this might be able to explain more than why
it is that ‘an orange crate is an orange crate, can be a
table’. If we could show how ‘ordinary’ categorical
judgements of table are just those occasion-specific
judgements where context is so widely shared that there
doesn’t seem to be any context at all, we might also be
able to offer an explanation of why it is that some people
find this idea so very counter-intuitive.

ABSTRACT

Analogical inference depends on systematic substitution of
the components of compositional structures. Simple
systematic substitution has been achieved in a number of
connectionist systems that support binding (the ability to
create connectionist representations of the combination of
component representations). These systems have used two
types of binding operators (generically renamed here as
bind() and bundle()) implemented in various ways. This
paper introduces a novel implementation of the bind()
operator. This implementation is interesting because it
removes some of the complexities of other implementations,
can be efficiently implemented, and allows easy
specification of queries in a way that highlights their
equivalence to analogical mapping problems.
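
As an illustration of the general idea only, the following
sketch shows one familiar way of realising bind() and bundle()
and of phrasing a query as an analogical mapping problem. It is
not the implementation introduced in the paper, which the
abstract does not specify: it assumes random vectors of +1/-1
components, elementwise multiplication for bind() (which is its
own inverse), elementwise addition for bundle(), and a
nearest-neighbour clean-up memory. The role and filler names,
the dimensionality and the helper functions are all
hypothetical.

```python
import random

DIM = 1024                 # vector dimensionality; an arbitrary choice
rng = random.Random(0)     # fixed seed so the sketch is repeatable

def random_vector():
    """A random vector of +1/-1 components."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Combine two vectors by elementwise multiplication (self-inverse)."""
    return [x * y for x, y in zip(a, b)]

def bundle(*vectors):
    """Superimpose several vectors by elementwise addition."""
    return [sum(components) for components in zip(*vectors)]

def cleanup(noisy, memory):
    """Return the name of the stored vector most similar to a noisy one."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    return max(memory, key=lambda name: dot(noisy, memory[name]))

roles = {name: random_vector() for name in ("agent", "object")}
fillers = {name: random_vector()
           for name in ("john", "mary", "cat", "mouse")}

# Two frames of role/filler bindings.
loves = bundle(bind(roles["agent"], fillers["john"]),
               bind(roles["object"], fillers["mary"]))
chases = bundle(bind(roles["agent"], fillers["cat"]),
                bind(roles["object"], fillers["mouse"]))

def analogue_of(filler_name, source, target):
    """Find what plays, in target, the role that filler_name plays in source."""
    role = cleanup(bind(source, fillers[filler_name]), roles)
    return cleanup(bind(target, roles[role]), fillers)

print(analogue_of("john", loves, chases))
print(analogue_of("mary", loves, chases))
```

Unbinding a frame with a filler recovers a noisy copy of the
role it occupies, and unbinding the second frame with that role
recovers the corresponding filler, so the query amounts to a
systematic substitution across the two structures.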

The binding operators may also be viewed as
representational operators because they are used for the
construction of complex, compositional representations. The
specific implementation of the representation operators
partially constrains the representations that may be
constructed. This paper shows that some binding systems are
unable to adequately represent hierarchical compositional
structures. A novel family of representational operators
(called braid()) is introduced to allow representation of
nested structures. Other potential uses of the braid()
operators are also explored.

The specific implementation of the representation
operators does not completely constrain the representations
which may be constructed. A system designer must also
choose a representational idiom for the encoding of
information. The choice of representational idiom will
further constrain the relative ease of different cognitive
operations. The most commonly used idiom (based on frames
of role/filler bindings) limits the simultaneous
representation of multiple objects. This paper proposes an
alternative idiom (also based on frames) to solve this
problem.

The new representational idiom highlights a previously
unnoticed problem (which exists in other connectionist
binding systems) with maintaining the disjointness of roles
and fillers. This problem is explored and several solution
approaches discussed. One interesting approach depends on a
generalisation of the newly introduced braid() operator.

The new representational idiom suggests that the cognitive
operations of bottom-up and top-down object recognition
should be relatively easy. These operations depend
absolutely on analogical mapping in order to connect
disjoint representations and drive perceptual search.

We propose a computational model for analogical problem
solving which especially addresses the influence of
structural characteristics on adaptation and learning
(Gentner, 1983). While there is strong empirical evidence
that semantic and pragmatic aspects are important
constraints for retrieval of source problems as well as for
analogical mapping (Hummel & Holyoak, 1997), we believe it
worthwhile to further investigate structural properties: it
is evident that in realistic settings source problems are
usually not isomorphical to target problems. But the
question of which kinds of structural properties are
necessary for successful adaptation is seldom addressed in
psychological experiments (Hummel et al., 1997), and there
are no computational models dealing with structure mapping
and adaptation and learning in the case of non-isomorphical
problems. For example, PUPS (Anderson & Thompson, 1989)
deals with adaptation and learning, but only for problem
isomorphs; LISA (Hummel et al., 1997) deals with
non-isomorphical problems, but addresses only analogical
access and mapping.

Our model IPAL was developed in the context of automatic
programming (Schmid and Wysotzki, 1998), but we believe
that it also contains useful ideas for cognitive modelling.
Problems as well as problem schemes are represented in a
common format, namely as graphs or trees. Mapping between
two problems (or a current problem and a problem scheme
already acquired) is done by means of a tree-metric: the
similarity between two structures is given by the weighted
number of operations (substitution, insertion and deletion
of nodes representing objects, relations or functions)
needed to transform the source structure into the target.
Mapping guides retrieval as well as adaptation. If two
structures are isomorphical, they can be transformed into
one another by a unique set of substitutions. Otherwise,
the source solution can be adapted to the target problem by
applying the operations gained by the mapping of the
problem descriptions to the solution of the source problem.
If a target problem could be successfully solved by
adaptation of a source, a generalized scheme, which covers
the common structure of source and target, is constructed.
The target problem and the generalized scheme are committed
to memory with the generalized scheme as parent to source
and target. Thereby, a hierarchical memory structure
develops while the system gets confronted with new
problems.
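
To make the notion of a tree-metric concrete, here is a small
sketch of a weighted edit distance between ordered labelled
trees. It is a simplification chosen for brevity and is not
claimed to be the metric used in IPAL: nodes can be relabelled
(substitution), but insertions and deletions are applied to
whole subtrees and charged per node, and the three weights are
free parameters. The tree encoding and all names are
hypothetical.

```python
from functools import lru_cache

def node(label, *children):
    """Build a hashable tree: (label, (child, child, ...))."""
    return (label, tuple(children))

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])

def tree_distance(source, target, w_sub=1.0, w_ins=1.0, w_del=1.0):
    """Weighted number of edits needed to turn source into target."""
    @lru_cache(maxsize=None)
    def dist(a, b):
        relabel = 0.0 if a[0] == b[0] else w_sub
        xs, ys = a[1], b[1]
        # table[i][j]: cost of turning the first i children of a
        # into the first j children of b.
        table = [[0.0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
        for i in range(1, len(xs) + 1):
            table[i][0] = table[i - 1][0] + w_del * size(xs[i - 1])
        for j in range(1, len(ys) + 1):
            table[0][j] = table[0][j - 1] + w_ins * size(ys[j - 1])
        for i in range(1, len(xs) + 1):
            for j in range(1, len(ys) + 1):
                table[i][j] = min(
                    table[i - 1][j] + w_del * size(xs[i - 1]),   # delete subtree
                    table[i][j - 1] + w_ins * size(ys[j - 1]),   # insert subtree
                    table[i - 1][j - 1] + dist(xs[i - 1], ys[j - 1]),
                )
        return relabel + table[len(xs)][len(ys)]
    return dist(source, target)

# Hypothetical problem structures.
source = node("plus", node("a"), node("b"))
isomorph = node("times", node("x"), node("y"))
larger = node("plus", node("a"), node("b"), node("c"))

print(tree_distance(source, isomorph))  # substitutions only
print(tree_distance(source, larger))    # one inserted node
```

In this sketch an isomorphic target is reached by substitutions
alone, while a target that properly contains the source
requires insertions; the list of edits behind the distance is
the kind of information that could then be applied to a source
solution during adaptation.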

We have tested IPAL with a variety of structural relations
between source and target pairs and obtained the following
results: if source and target are isomorphical, adaptation
success is 100%; for homomorphical structures (mono- or
epimorphical), 66%; for problems with no defined structural
relation, 4%. This shows that there have to be
characteristics for structural relationships not covered by
the concept of morphisms. Our next aim therefore is to
identify further structural constraints for adaptation
success.

Additionally we have performed two experiments where the
structural similarity between source and target was
systematically varied. We obtained the following results:
(1) people are able to adapt partial isomorphic problems
(i.e. the source structure is contained completely in the
target structure) only if the superficial similarity
between source and target is high (Keane et al., 1994); (2)
given high superficial similarity, partial isomorphs can be
adapted successfully if the number of nodes of the common
structure is greater than or equal to the number of nodes
of the (larger) structure of the target problem.

We propose a new contextually aware universal paradigm
which can extend or replace formal logic, and which is
capable of supporting hierarchical metastates and a
description of the development of life and consciousness
through evolutionary computation.

In presupposing a coherent universe, we acknowledge the
correlation of its constituent properties and processes,
and accept that all of its regions must remain
communicative to support coherence. Distinguishable forms
then exist through the actions of one coherent set of
processes. Successful survivalist processing of massive
amounts of real-time data by living entities necessitates
the availability of simplified but locally representative
models of "reality" which are couched in terms familiar to
the processor: the use of analogues.

The selection of favourable analogues follows the same
criteria and suffers from the same difficulties as does
their successful linguistic transmission. We can integrate
these two processes into a single format, that of a unified
hierarchical symbolic language which displays
only-partially-deterministic coupling between its formally
represented parts. "AQuARIUM" provides a framework for this
symbolic language, which consists initially of only a
single symbol. The symbol contains just enough information
to invite questions as to its significance, without
presenting sufficient detail for an intelligently
inquisitive "selector" to be sure of the correctness of an
initial guess as to its meaning. The nature of the
resulting questions can then be used to evaluate the
context into which more detailed description will be
placed, rather than presupposing unilaterally a "correct"
comprehensional context.


Separate  analogues  emerge  from  "reality"  as 
structures  which  correspond  to  the  formulation 
of  “locally  sufficient”  approximating  metastat¬ 
ic  reprCvSentations  of  an  otherwise  partially  dis¬ 
ordered  or  chaotic  region  of  the  universal  phase 
space.  Consequently,  an  analogue  is  always  to 
some  extent  defective  in  Its  detail,  in  that  it  must 
of  necessity  exhibit  differences  from  its  "real" 
counterpart.  Internally,  for  an  "originating"  pro¬ 
cessor,  the  use  of  a  selected  analogue  is  relative¬ 
ly  simple,  given  a  good  memory  of  which  char¬ 
acteristics  have  been  selected  as,  or  determined 
to  be,  "correct"  analogous  details.  However,  the 
transfer  of  an  analogue  from  one  processor  to 
another  is  fraught  with  dangers.  The  major  diffi¬ 
culty  in  selecting  a  transferable  analogue  is  to 
match the "representative" characteristics recognised
by its creator to those which are interpreted
by its receptor. For example, in likening the
flow  of  "electrons"  through  a  network  of  wires 
and  switches,  to  the  early-morning  rush  of  com¬ 
muters  through  tunnels  and  barriers  in  accessing 
the  Metro,  we  should  not  assume  that  "electrons" 
carry  briefcases  with  them,  nor  that  first  of  all 
they  kiss  their  wives  goodbye  before  commenc¬ 
ing  the  journey. 

Communication  of  an  idea  from  one  pro¬ 
cessor  to  another  depends  on  an  equivalence  of 
both of their logic systems and their data environments,
or alternatively on a successful manner
of evaluating any differences between these
and correcting for them. This always necessitates
a two-directional process where ultimately
it will be unimportant which of the two processors
initiated the communication, but only
whether  this  evaluation  and  correction  has  been 
successfully  carried  out.  The  implied  correspon¬ 
dence  to  inter-processor  cooperation  is  inher¬ 
ent  to  the  framework  provided  by  AQuARIUM. 

Ultimately,  in  a  coherent  universe,  all  an¬ 
alogues  of  all  "realities"  are  equivalent  when 
account  is  taken  of  their  associated  approxi¬ 
mations,  and  they  can  consequently  all  be  in¬ 
tegrated  into  a  descriptive  language  of  this 
kind. The maintenance of universal
coherence  requires  continuous  communication 


between  all  stable  metastatic  entities,  yet  the 
natural  presence  of  an  Einsteinian  communi¬ 
cation  restriction  eliminates  the  possibility  of 
instantaneous  direct  correlation  in  a  causally 
coherent  domain.  Formally  defined  metastates 
cannot  communicate  directly  with  each  oth¬ 
er,  and  any  communication  which  does  occur 
must  take  place  at  least  partially  through  the 
causal  chaos  represented  by  nonlocality.  The 
complete  range  of  possibilities  between  these 
two  extremes  can  initially  be  modeled  in 
AQuARIUM  by  a  modified  recursive  form  of 
Dempster-Shafer probability.
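
The abstract does not specify the "modified recursive form" it refers to. For orientation only, the following is a minimal sketch of the standard, unmodified Dempster rule of combination on which Dempster-Shafer theory rests. The function name `combine`, the hypothesis labels `x` and `y`, and the mass values are illustrative assumptions, not taken from AQuARIUM.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with the standard Dempster rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute only to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict
    return {s: v / norm for s, v in combined.items()}


# Hypothetical example: two sources over the frame {x, y}.
m1 = {frozenset({"x"}): 0.6, frozenset({"x", "y"}): 0.4}
m2 = {frozenset({"y"}): 0.5, frozenset({"x", "y"}): 0.5}
fused = combine(m1, m2)
```

Mass left on the whole frame `{x, y}` expresses ignorance rather than belief in either hypothesis, which is what distinguishes this calculus from ordinary probability.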

An account of analogical thinking must
explain structure sensitivity and flexibility in
the comparison process (Gentner & Markman,
1993;  Hummel  &  Holyoak,  1997).  Analogi¬ 
cal  mapping  is  widely  viewed  as  the  alignment 
of  structured  representations  to  maximize 
common  relational  structure.  The  process 
model of structure-mapping, as operationalized
in SME (Falkenhainer, Forbus & Gentner,
1989), relies on matching predicates that
are  identical  in  both  the  source  and  target.  In 
addition,  non-identical  matches  can  be  made 
when:  1)  systems  of  identity-matches  license 
correspondence  between  certain  non-identical 
elements,  and/or  2)  semantically  similar,  but 
non-identical,  predicates  are  candidates  to  be 
placed  in  analogical  correspondence.  We  sug¬ 
gest  a  process  of  re-representation  during  com¬ 
parison  by  which  semantic  content  can  be  de¬ 
composed,  integrated  or  abstracted  to  allow 
for the alignment of underlying commonalities
between base and target (see Gentner &
Medina, in press).
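
The identical-predicate matching step described above can be illustrated with a small sketch. This is not SME itself: it only pairs expressions whose predicate names and arities are identical, and it omits structural consistency checks, systematicity scoring, and re-representation. The function name and the solar-system and atom facts are hypothetical illustrations.

```python
def match_identical_predicates(base, target):
    """Pair each base expression with every target expression that has
    the same predicate name and the same number of arguments."""
    hypotheses = []
    for b_pred, *b_args in base:
        for t_pred, *t_args in target:
            if b_pred == t_pred and len(b_args) == len(t_args):
                # Aligning two expressions implies argument correspondences.
                hypotheses.append((b_pred, tuple(zip(b_args, t_args))))
    return hypotheses


# Hypothetical descriptions: each expression is (predicate, arg1, arg2).
base = [
    ("attracts", "sun", "planet"),
    ("more_massive", "sun", "planet"),
    ("hotter", "sun", "planet"),
]
target = [
    ("attracts", "nucleus", "electron"),
    ("more_massive", "nucleus", "electron"),
]
hypotheses = match_identical_predicates(base, target)
```

In this sketch `hotter` finds no partner because the target contains no identical predicate; cases of that kind are where semantically similar matches or re-representation would have to take over.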

There  has  not  been  a  direct  experimental 
test  of  how  these  processes  occur  in  real  time. 
The  present  investigation  uses  a  methodologi¬ 
cal  paradigm  in  which  participants  make  on¬ 
line  judgments  about  the  analogical  relatedness 
of  pairs  of  structured  stimulus  items  that  vary 
in  their  similarity  relationships.  We  report  ac¬ 
curacy  and  RT  data  in  the  evaluation  of  analo¬ 
gies  that  reveal  systematic  differences  depend¬ 
ing  on  the  kind  and  degree  of  similarity  between 
items  being  compared.  Implications  of  these 


data  for  the  underlying  process  of  comparison 
are considered.

Experimental studies on analogy making
mainly rely on a ‘source-target’ paradigm in
which a source situation is taught to the participants
before testing their behavior within the
target situation. This makes it possible to control
the subjects' knowledge of the source, and
also to manipulate the source in order to study
the influence of the manipulated features. This
paradigm is also directly transposable to
teaching situations in which the source can be
taught in order to help the subject understand
the target. However, this paradigm also has some limits.
Firstly, it is difficult to control to what
extent previous knowledge intervenes in the
process of building a representation of the
source and of the target. Some interpretative
effects  have  been  demonstrated  in  those  situa¬ 
tions  (Bassok,  Wu,  &  Olseth,  1995).  Secondly, 
this  paradigm  is  not  suitable  for  studying  the 
whole  range  of  the  analogies.  In  ecological  sit¬ 
uations,  spontaneous  analogies  usually  rely  on 
familiar  sources  which  can  hardly  be  taught 
within  an  experimental  session.  For  instance, 
children  take  their  knowledge  about  human 
beings  as  a  source  in  many  situations  (Inagaki 
& Hatano, 1991).

Another  paradigm  can  be  used  to  study 
spontaneous  analogies  in  which  any  knowledge 
in  long  term  memory  may  be  a  potential  source: 
no  source  is  given  to  the  participant  and  his/her 
behavior  is  compared  with  the  one  predicted 


through a hypothesized source. This paradigm
makes it possible to predict and explain the difficulties met
by  participants  in  a  wider  range  of  situations 
than  within  the  classical  paradigm. 

We  present  two  experiments  in  which  prob¬ 
lem  solving  situations  are  analyzed  as  relying 
on  analogies  with  familiar  sources. 

In the first experiment, children who had started
to study column subtraction without borrowing
were asked to solve column subtractions with
borrowing. Their mistakes were predicted
by reference to two main familiar sources:
subtracting is like taking a part from a whole,
and subtracting is like covering a distance. A
model was built on the basis of the use of those
analogies, and the result of the simulation was
compared to the pattern of responses. We were
able to simulate 83% of the responses.

In  a  second  experiment,  adults  are  asked  to 
solve  isomorphs  of  the  Tower  of  Hanoi  in  which 
they  have  to  move  or  to  change  the  size  of  ob¬ 
jects.  Difficulties  are  predicted  through  the  use 
of  two  sources  depending  on  the  isomorph: 
knowledge  about  taking  a  lift,  and  knowledge 
about biological growth. We show that the difficulties
result from the use of these familiar sources.
Their use entails additional constraints which
lead to the construction of an inadequate problem space.

The  results  support  the  idea  that  analogies 
allow the learners to attribute to the new situations
the properties of well known situations. The
value of this paradigm is that it makes it possible to identify
the nature and the functions of the familiar
knowledge involved in the analogy mechanism.

One fundamental question in cognitive
psychology  is  whether  knowledge  construct¬ 
ed  during  the  analysis  of  examples  is  stored  in 
an  abstract  form  or  whether  it  is  kept  in  its 
full  form.  A  related  question  concerns  the  con¬ 
ditions  under  which  knowledge  can  be  used 
in  a  problem  solving  situation.  Can  an  exam¬ 
ple  be  understood  and  reused  to  solve  a  new 
problem  without  resorting  to  an  abstract  rep¬ 
resentation?  We  present  two  studies,  with  nov¬ 
ices  in  the  game  of  chess,  investigating  the 
existence  of  a  process  of  reasoning  by  analo¬ 
gy  that  does  not  require  the  mediation  of  an 
abstract  knowledge  structure.  In  the  first  ex¬ 


periment,  subjects  analyse  chess  problem  ex¬ 
amples  and  then  solve  similar  problems.  The 
results  showed  that  during  transfer,  subjects 
use  knowledge  that  has  a  very  low  degree  of 
abstraction:  they  only  succeed  on  problems 
similar to the examples when they are perceptually
close (in particular, they failed when we
changed, symmetrically, the positions of the chess
pieces on the chessboard).

Experiment 2 investigates the role of failure
in analogical transfer. The results suggest
that attempting to solve the source problem,
and encountering failures, is a determining factor in
case-based reasoning.

ABSTRACT

The  present  study  investigated  whether  pre¬ 
school  children  recognize  numerical  equiva¬ 
lence  between  sets  of  objects  that  vary  in  simi¬ 
larity.  The  results  indicate  that  the  ability  to 
recognize  numerical  equivalence  for  varying 
object  sets  emerges  gradually  during  the  pre¬ 
school  period.  Verbal  counting  ability  is  linked 
to  success  on  some  but  not  all  comparisons. 

BACKGROUND 

On  the  face  of  it,  the  task  of  judging  nu¬ 
merical  equivalence  seems  much  like  judging 
similarity along any other dimension: entities
are  compared  and  a  common  attribute  or  rela¬ 
tion  is  identified.  Therefore,  one  might  expect 
children’s  numerical  equivalence  judgments  to 
develop  like  similarity  judgments  in  other  do¬ 
mains.  For  example,  the  effects  of  surface  sim¬ 
ilarity  on  children’s  comparisons  are  well-doc¬ 
umented  in  a  variety  of  non-numerical  tasks 
(Gentner  &  Toupin,  1985;  Holyoak,  Junn,  & 
Billman, 1984; Kotovsky & Gentner, 1996;
Rattermann,  Gentner,  &  DeLoache,  1989). 
Thus,  children  may  have  difficulty  recogniz¬ 
ing  number  as  the  relevant  relation  when  the 
sets  being  compared  are  otherwise  very  differ¬ 
ent.  In  addition,  children’s  responses  in  numer¬ 
ical  equivalence  tasks  may  shift  from  an  em¬ 
phasis  on  surface  similarity  to  an  emphasis  on 
relational similarity over development, i.e., the
relational shift described in other domains
(Gentner, 1988; Gentner & Rattermann, 1991).
Finally,  knowledge  of  the  count  words  might 
improve  numerical  equivalence  judgments  just 


as  the  act  of  naming  has  helped  focus  children’s 
attention  on  category-relevant  dimensions  in 
other domains (Gentner & Rattermann, 1991;
Smith,  1993). 

However,  current  views  of  number  devel¬ 
opment  paint  a  different  picture.  Reports  of  nu¬ 
merical  abstraction  in  infants,  as  well  as  other 
early  numerical  competencies,  have  led  to  the 
proposition that numerical development is guided
by a set of innate domain-specific principles
(Gallistel & Gelman, 1992; Gelman, 1991).
These  principles  are  supposed  to  provide  a 
structure  that  supports  and  promotes  numeri¬ 
cal  development.  If  so,  then  development  of 
numerical  equivalence  judgments  might  be 
immune  to  the  difficulties  children  encounter 
judging other types of similarity.

METHOD 

The  basic  procedure  involved  a  triad 
matching  task  in  which  preschool  children 
matched  a  target  set  with  2,  3,  or  4  items  to 
one  of  two  choice  cards  that  showed  an  equiv¬ 
alent  number  of  dots.  The  critical  manipula¬ 
tion  was  that  the  contents  of  the  target  sets 
varied  across  conditions.  In  one  condition,  the 
target  sets  were  nearly  identical  to  the  sets  on 
the  choice  cards  (dots-to-dots).  In  a  second 
condition,  the  target  sets  were  homogeneous 
groups  of  objects  that  were  different  from  the 
sets  on  the  choice  cards  (shells-to-dots).  In  the 
third  condition,  the  target  sets  were  heteroge¬ 
neous  sets  of  objects  that  also  differed  from 
the  sets  on  the  choice  cards  (random  objects- 
to-dots).  In  addition  to  these  matching  tasks, 
children also were given several counting tasks
to assess their knowledge of the conventional
count  words. 

RESULTS 

There  was  a  clear  difference  in  perfor¬ 
mance  depending  on  which  comparison  chil¬ 
dren  were  making.  First,  the  conditions  with 
less  surface  similarity  were  significantly  more 
difficult  than  the  literal  dots-to-dots  condition 
across age (shells-to-dots vs. dots-to-dots:
F(1, 28) = 8.71, p < .01; random objects-to-dots
vs. dots-to-dots: F(1, 42) = 28.74, p <
.0001). This is consistent with work in other
domains  showing  that  surface  similarity  af¬ 
fects  transfer  in  young  children. 

Second,  there  was  evidence  of  a  relational 
shift.  Children  performed  above  chance  on  the 
disks-to-dots  comparison  at  a  younger  age  than 
children  performed  above  chance  on  shells-to- 
dots  comparison.  Furthermore,  children  per¬ 
formed  above  chance  on  the  shells-to-dots  com¬ 
parison  at  a  younger  age  than  children  per¬ 
formed  above  chance  on  random  objects-to- 


dots.  Thus,  over  development,  children  gradu¬ 
ally  extended  their  equivalence  judgments  from 
comparisons  with  high  surface  similarity  to 
comparisons  with  only  relational  similarity. 

Third,  conventional  counting  ability  ap¬ 
peared  to  improve  performance.  Children  who 
were  competent  counters  performed  all  three 
matching  tasks  above  chance.  However,  chil¬ 
dren  who  were  not  competent  counters  per¬ 
formed  at  chance  on  the  shells-to-dots  and  ran¬ 
dom  objects-to-dots  comparisons.  Thus,  know¬ 
ing  the  verbal  labels  for  small  sets  may  aid  in 
transfer  for  less  literal  numerical  comparisons. 
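
The abstract does not report how performance was compared with chance. As a hedged illustration, one common approach for a two-choice task, where guessing succeeds with probability 0.5, is a one-sided exact binomial test. The function name and the trial counts below are hypothetical and are not data from the study.

```python
from math import comb


def p_above_chance(correct, trials, chance=0.5):
    """One-sided exact binomial p-value: the probability of getting at
    least `correct` matches in `trials` attempts by guessing alone."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical child: 10 correct matches out of 12 two-choice trials.
p_value = p_above_chance(10, 12)
```

A small `p_value` indicates that a score this high would rarely arise from guessing between the two choice cards.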

CONCLUSIONS 

The  present  results  indicate  that  numerical 
equivalence judgments develop much like other
comparisons, inasmuch as surface similarity
and labeling affect performance. In contrast,
the  present  findings  are  inconsistent  with  the 
view  that  development  of  number  concepts  is 
privileged by virtue of innate, domain-specific
knowledge structures.

Analogical reasoning, an important cognitive
skill, involves perceiving similar relationships
in dissimilar domains. Early theorists believed
that true analogical reasoning capability
was achieved around adolescence and that
young children were incapable of engaging in
analogical reasoning and transfer. Analogical
transfer involves ignoring nonanalogous information,
extracting relevant analogous information
from one particular domain, and using
it  to  answer  questions  or  solve  problems  in  a 
different  domain. 

However,  much  recent  research  has  dem¬ 
onstrated  not  only  early  analogical  reasoning, 
but  early  analogical  transfer  abilities  in  chil¬ 
dren.  Much  new  research  has  focused  on  chil¬ 
dren  three,  four,  and  five  years  old.  However, 
few studies occur in the regular classroom or seek
to illuminate the capabilities of elementary
school-aged children. This study sought to address
these issues.

Seven 4th-grade classes, four experimental
and three control, participated in a group
intervention  designed  to  train  students  in  ana¬ 
logical  reasoning  and  transfer.  The  training  was 
undergirded  by  principles  embraced  by  the 
knowledge-based  view  of  analogical  reasoning. 
This  perspective  holds  that  if  children  are  fa¬ 
miliar  with  the  objects  in  the  analogy  and  un¬ 


derstand  the  relations  between  the  items  in  the 
analogy,  they  will  have  no  difficulty  engaging 
in analogical solution and transfer.

The  intervention  consisted  of  six  sessions: 
pretest,  metaphorical  story  presentation,  three 
training sessions, and posttest. During the story
presentation, students read a metaphorical
story that served as a tool in analogy solution
and  transfer.  The  two  A  groups  were  pretested 
and  trained  on  analogies  from  the  domains  of 
relations (such as male/female, singleton/group,
part/whole  and  sequence),  mathematics  and 
metaphors.  The  A  groups  were  posttested  on 
analogies  from  the  domains  of  word  forms 
(such as antonyms, synonyms, palindromes and
homonyms),  story  problem  solving,  and  spa¬ 
tial  relations  analogies.  The  B  groups’  presen¬ 
tations  were  reversed. 

Analyses revealed significant training effects
for one A group and both B groups. Significant
transfer effects were demonstrated for
both A groups and one B group. There were no
significant gender-related differences in either
group in the posttest domains. The training was
an  effective  vehicle  for  teaching  children  both 
analogical  solution  and  analogical  transfer. 
Further  research  should  be  done  to  refine  the 
training program, with a goal of implementation
in elementary schools.

ABSTRACT

This  paper  considers  from  a  philosophical 
perspective  the  idea  that  analogy  has  been  the 
principal  underlying  mechanism  in  the  evolu¬ 
tionary  development  of  language. 

The  view  that  language  is  fundamentally 
analogical  in  nature  is  increasingly  being  con¬ 
sidered  by  philosophers.  Evidence  for  this 
view  is  briefly  set  out,  in  the  form  of  the  wide¬ 
spread  systems  of  analogy  and  metaphor  re¬ 
cently  documented  by  Lakoff  and  Johnson, 
and  of  vocabulary  itself,  the  bulk  of  which 
shows  signs  of  having  been  formed  by  analo¬ 
gy-like  processes  of  construction.  The  evi¬ 
dence  is  substantial  enough  to  prompt  the  hy¬ 
pothesis  on  which  the  paper  centres,  that  such 
analogical  construction  has  been  the  dominant 
evolutionary  process  in  language. 

The  paper  proceeds  to  examine  the  philo¬ 
sophical  implications  of  this  idea,  and  in  the 
course of so doing develops a theory of language
evolution called Analogical Generalism
which  takes  the  idea  as  one  of  its  central  con¬ 
cepts.  In  considering  Analogical  Generalism  the 
paper  does  not  concern  itself  with  individual 
historical  languages,  but  rather  the  overall 
trends  of  language  evolution  which  the  theory 
implies and which, if the theory is correct, must
have  been  instantiated  in  the  actual  develop¬ 
ment  of  all  historical  languages. 

A  concept  of  ‘articulation’  is  introduced. 
This  is  the  characteristic  of  words  which  makes 
some  display  more  structure  in  expressing 
meaning  than  others.  It  is  argued  that  some 
words  are  ‘articulatively  general’,  having  little 
or no expressive structure, while others are ‘articulatively
complex’. This concept is related
to  the  complexity  of  the  unconscious  linguistic 


knowledge  users  bring  to  understanding  the 
sense  of  words. 

It is then argued that analogical construction
of new vocabulary can only give rise to
words  of  greater  articulative  complexity  than 
their  source  terms.  This  means  that  a  language 
evolution  dominated  by  this  process  must  have 
developed  broadly  from  articulatively  general 
terms  towards  more  precise,  articulatively  com¬ 
plex  ones. 

An  evolutionary  trend  towards  increasing 
expressive  complexity  over  time  implies  that 
language  must  have  had  its  origins  in  articula¬ 
tively  highly  general  terms.  This  concept  intro¬ 
duces  the  ‘Generalist’  component  of  the  theo¬ 
ry  developed  in  the  paper.  It  is  argued  that  there 
exist even in modern languages certain words
of  absolutely  minimal  articulative  structure. 
These ‘primal words’ can typically be substituted
for by gestures, and as such may represent
a missing link between animal communication
and modern human language. It is argued
that examples of mammalian communication
such as the barking of dogs can plausibly
be thought of as expressing meaning at the
same  minimal  level  of  articulation  as  human 
primal  terms  -  indeed  that  in  certain  cases  the 
meaning  expressed  may  itself  be  identical  to 
that  expressed  by  human  primal  words. 

Other  aspects  of  the  Generalist  position 
about  language  origins  are  explored,  and  con¬ 
trasted  with  the  more  conventional  picture 
which  sees  articulation  as  an  invariable  con¬ 
stant  in  language.  In  various  ways  it  is  shown 
that  this  new  approach  represents  a  superior 
position  to  the  rather  naive  Articulative  Atom¬ 
ism  of  the  latter.  This  is  most  particularly  so  in 
the  fact  that  it  is  not  committed  to  any  radical 
discontinuity in the early development of meaningful
language. The Generalist view makes it
possible  to  understand  the  evolution  of  the  ear¬ 
liest  words  as  end  products  of  a  continuous  and 
progressive  development  of  the  primal  language 
forms  of  higher  primates  and  early  hominids. 

The  Generalist  origins  that  are  implied  by 
the  trends  which  would  be  imposed  on  lan¬ 
guage  development  by  a  dominant  process  of 
analogical  construction  solve,  then,  some  of 


the  more  intransigent  problems  concerning  the 
origins  of  language.  It  is  concluded  that  the 
hypothesis  of  the  dominance  of  analogical 
construction, together with a Generalist account
of language origins, is from a philosophical
perspective sound. It is therefore proposed
that  the  outline  of  a  coherent  account  of  lan¬ 
guage  evolution  has  become  evident  in  the 
theory of Analogical Generalism.

Research within the paradigms of conceptual
metaphors and discourse analysis is brought
together in this study of the patterns and strategies
of euphemization of ‘death’ as a taboo topic.
The phenomenon of euphemization, traditionally
considered from an isolated lexico-semantic
point of view, is explored here within a
combined  model  of  metaphoric  patterns  and 
discursive  strategies.  Conceptualization  of 
‘death’ is carried out along a number of dimensions
such as: individual vs. universal experience,
controlled vs. uncontrolled, irreversible
vs.  reversible,  gradual  vs.  sudden  (expected  vs. 
unexpected),  event  vs.  state,  etc.  It  is  based  on 
a range of conceptual metaphors: ontological,
structural, orientational. The choice of metaphorical
pattern highlighting certain orientations
within the various dimensions serves euphemistic
purposes. Thus, euphemization consists in the
preferred use of one underlying conceptual
metaphor instead of another in the construal of
the concept of death (e.g., Death-as-Journey vs.
Death-as-Struggle).  Discourse  structure  is  ex¬ 
amined  in  texts  employing  a  set  of  strategies 
which  exploit  certain  aspects  of  conceptual 
structure  as  identified  above  for  purposes  re¬ 
lated  to  the  psychological  motivation  of  the 
usage  of  euphemization  (general  models  of 
human  coping  behaviour)  as  well  as  communi¬ 
cative  goals  which  reflect  situational  charac¬ 
teristics,  e.g.,  text  genre.  Different  discourse- 
framing  devices  are  used.  Thus,  the  study  re¬ 
veals  the  existence  of  a  systematic  relationship 
between  the  patterns  of  selective  highlighting 
of  conceptual  structure  and  discourse  construc¬ 
tive  strategies  which  constitute  euphemization 
as  a  psychologically  and  communicatively 
motivated phenomenon.